Why don't mercurial file sets work when adding files? - mercurial

I'm trying to use mercurial file sets to add all the files in a directory tree, excluding very large files and any binary files. Cribbing from the mercurial documentation, this command should do it:
hg init
hg add 'set: size("<1M") and not binary()'
However this returns a status code of 0, and hasn't added anything to my new, empty repo. I've tried just 'set: not binary()' and that didn't work either.
The frustrating thing is that although I can google for mercurial file sets, and find lots of examples, I can't find anything to help troubleshoot when it doesn't work!
I don't have a .hgignore file, and it's a fresh empty repo. Mercurial 4.2.2.
The directory where I'm testing this has a couple of artificially created files for the purpose of testing. In my real use case, I inherit a multi-gigabyte tarball of assorted sources and binaries from a client, and I want to get all the sources into mercurial before I start hacking to fix their problems, hence the need to exclude the binaries and large files that otherwise choke mercurial.
Here's my little test script:
#!/bin/sh -ex
dd if=/dev/urandom of=binary_1k bs=1 count=1024
dd if=/dev/urandom of=binary_2M bs=1 count=2097152
echo "This. Is, a SMALL text file." > text_small
hexdump binary_1k > text_1k
hexdump binary_2M > text_2M
ls -lh
file binary_1k
file binary_2M
file text_1k
file text_2M
hg init
hg add 'set: size("<1M") and not binary()'
hg status -a
hg add 'set: not binary()'
hg status -a
hg add 'set: size("<1M")'
hg status -a
At the end of this, each status command reports no files in the repo, and the add commands report no errors.

The problem is that a file set is evaluated as a query against Mercurial's repository database, which only knows about files that are already tracked, i.e. files that have been committed or at least added.
One solution is to add everything first and then forget the files you don't want, e.g.:
hg add
hg forget 'set:size(">1M") or binary()'
This works because the query also covers newly added files, even if they haven't been committed yet.
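If adding everything first is impractical (for instance because the oversized files are what choke Mercurial in the first place), one alternative is to pick the files with ordinary shell tools and hand that list to hg add. The sketch below is not from the answer above and rests on assumptions: the function name list_small_text_files is made up, the 1 MiB limit mirrors size("<1M"), and the NUL-byte test is only meant to approximate how binary() classifies files.

```shell
#!/bin/sh
# Hypothetical helper: print regular files below the current directory that
# are smaller than 1 MiB and contain no NUL byte. Assumption: the NUL-byte
# check approximates what Mercurial's binary() predicate looks for.
list_small_text_files() {
    # Skip the .hg metadata directory; -size -1048576c means "< 1048576 bytes".
    find . -name .hg -prune -o -type f -size -1048576c -print |
    while IFS= read -r f; do
        # Deleting NUL bytes leaves a text file unchanged, so cmp succeeds.
        if tr -d '\000' < "$f" | cmp -s - "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

# Possible usage (breaks on file names containing whitespace):
#   list_small_text_files | xargs hg add
```

Running hg status -a afterwards shows which files were actually scheduled for addition.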

Related

Undo an accidental hg strip?

I have accidentally run hg strip, and deleted a stack of commits. I have not done anything in the repo since. Is there a way for me to bring back this stack of commits, to undo the hg strip I just ran?
As long as you didn't run the strip with the --no-backup option, the stripped changesets can be found in the repository under .hg\strip-backup. If you sort the directory contents by date, the latest bundle is likely the one you need to restore. Restore it with hg unbundle <filename>.
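The "sort by date and take the latest" step can be scripted. A minimal sketch, assuming a POSIX shell (for example Git Bash on Windows) and bundle names without whitespace; latest_strip_backup is a hypothetical helper name, not a Mercurial command:

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified backup bundle in
# <repo>/.hg/strip-backup. $1 is the repository root (defaults to ".").
latest_strip_backup() {
    repo=${1:-.}
    # ls -t sorts newest first; parsing ls is tolerable here only because
    # bundle names are assumed to contain no whitespace or newlines.
    ls -t "$repo"/.hg/strip-backup/*.hg 2>/dev/null | head -n 1
}

# Possible usage from inside the repository:
#   hg unbundle "$(latest_strip_backup "$(hg root)")"
```

If the newest bundle turns out not to be the right one, the remaining bundles in the same directory can be tried in turn.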
It is possible to hg pull from a strip backup file as an alternative to using hg unbundle.
As noted in a comment on another answer to this question, hg unbundle has fewer options and only works with bundles, but can unbundle more than one bundle at a time, whereas hg pull pulls from a single source (share/web/bundle) and has other options.
Here's an example of using hg pull based on an external post by Isaac Jurado:
Usually the backup is placed in REPO/.hg/strip-backup/. See the
example below:
$ hg glog
@ changeset: 2:d9f98bd00d5b tip
| three
o changeset: 1:e1634a4bde50
| two
o changeset: 0:eb14457d75fa
one
$ hg strip 1
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
saved backup bundle to
/Users/hchapman/ttt/.hg/strip-backup/e1634a4bde50-backup.hg
And then, what one would do to recover those changesets would be:
$ hg pull $(hg root)/.hg/strip-backup/e1634a4bde50-backup.hg
Here is a worked example of unbundle from an external post. I've cleaned it up slightly to make it a little more general:
Recovering stripped files when using Mercurial
If you accidentally strip a patch and do not have a backup for it, you
can still recover your files using Mercurial. To recover your files:
1. Open a Microsoft Windows Command Prompt window.
2. Navigate to the project folder where you stripped the files.
3. Run the dir command.
4. Navigate to the .hg folder where Mercurial stores all relevant project files.
5. Run the dir command again.
6. Navigate to the strip-backup folder where Mercurial stores the backup bundles of stripped patches.
7. Run the dir command again. Multiple files display in the directory that use the <hash>-hg format. They are the backup bundles of stripped patches.
8. Use Windows Explorer to find the required file. Open the strip-backup folder in Windows Explorer, and sort by Date modified descending. Unless the necessary backup bundle is already known, [it is recommended to] restore the bundles in reverse chronological order starting from the most recent bundle.
9. Navigate back to the project folder.
10. To restore a bundle, run hg unbundle .hg\strip-backup\<bundle_file_name>. ... You may want to add it to the PATH environment variable to make it accessible globally.
11. Synchronize the project [using hg pull] to see the restored patch. If the restored patch is not the one needed, then continue restoring the patches in reverse chronological order until the required patch is retrieved.
Note: You may restore the backup bundles in any order, instead of
using reverse chronological order. However, it may not be safe to do
so. You may end up attempting to restore a backup bundle, which has a
dependency on another backup bundle that has not been restored. In
this case, you will get an error.

Mercurial - Always take local for files matching on a specific folder

I need to configure Mercurial so that it always takes the local version of any file in a specific folder.
E.g.: on a conflict in /**/dist/, always take local.
This is because I need to commit some built files.
Thanks in advance
EDIT:
I need to commit some files generated by build tools (libsass, webpack) because my build system is temporarily unavailable, so I removed these files from .hgignore. The problem now is that Mercurial reports merge conflicts on these generated files. I want to automate conflict resolution by always taking the local version of these files, similar to "How to keep the local file or the remote file during merge using Git and the command line?" but for Mercurial.
You can put a merge-pattern in your ~/.hgrc or .hg/hgrc to specify the default tool for a merge for a given file:
[merge-patterns]
**/dist/* = :local
The :local merge tool will prioritize the local version. See hg help merge-tools for a full list of internal merge tools. Note that using the --tool option during a merge will override this choice; however, setting the ui.merge option to define your default merge tool will not.
The **/dist/* pattern may or may not be what you need. Please adjust it to your needs (and note that regular expression patterns are also available for additional flexibility if required).
Alternatively, you can also automatically resolve these files after the merge, e.g. with:
hg resolve --tool :local $(hg files -I '**/dist/*')
Or, if the list of files is too large to fit on the command line:
hg files -0 -I '**/dist/*' | xargs -0 hg resolve --tool :local
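To apply the rule to a single repository without editing the file by hand, the section can be appended to .hg/hgrc from a script. This is only a sketch: add_local_merge_pattern is a hypothetical helper, and it relies on Mercurial merging a section that appears more than once in a configuration file, so appending a second [merge-patterns] block should not clobber existing entries.

```shell
#!/bin/sh
# Hypothetical helper: append a ":local" merge rule to a repository's hgrc.
# $1 is the repository root, $2 the file pattern (for example '**/dist/*').
add_local_merge_pattern() {
    repo=$1
    pattern=$2
    # Appending keeps whatever configuration is already in the file.
    printf '\n[merge-patterns]\n%s = :local\n' "$pattern" >> "$repo/.hg/hgrc"
}

# Possible usage from inside the repository:
#   add_local_merge_pattern "$(hg root)" '**/dist/*'
```

Keeping the rule in .hg/hgrc rather than ~/.hgrc limits it to the one repository that actually contains committed build output.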

Mercurial creates unversioned copies of files during update to different branch

A little while ago I noticed that hg started creating unversioned copies of files in the repository at seemingly random times when I update between branches. I can't for the life of me think of what I might have changed for this to start happening. There is nothing in the verbose or trace output to indicate that these files are being created.
The new unversioned filenames all end with what seems to be a random string added to the end of the extension:
file1.txt-23121dd1
someotherfile.sql-bc769bd2
bizarrofile.cs-40a93ed0
hgisinvadingurhead.ppt-f8e9015a
When trying to determine the pattern of this happening I've noticed the following:
The added characters in the filenames do not correspond with any changeset ID in the repository. I have done a grep -i to the output of hg history and the string in the filename does not appear anywhere in the output.
In all cases the files existed in the branch I was working on but do not exist in the branch I update to.
Sometimes it's only one or two files, sometimes it's several.
It is never the case that these are all of the files that exist in one branch but not the other.
It is never the case that it is the same set of unversioned files between updates.
Others on my team who are cloning the same repositories do not seem to be experiencing this.
I thought maybe it was something within the repository but it also happens in other existing repositories and in brand new ones as well.
For example, I have done this (hg output omitted except for hg status output at the end, but no errors come from the output):
c:\> mkdir repo
c:\> cd repo
c:\repo\> hg init
c:\repo\> echo default > default.txt
c:\repo\> hg add
c:\repo\> hg commit -m "Commit default"
c:\repo\> hg branch branch1
c:\repo\> echo branch1 > branch1.txt
c:\repo\> hg add
c:\repo\> hg commit -m "Commit branch1"
c:\repo\> hg update default
c:\repo\> hg status
? branch1.txt-23121dd1
This is not repeatable every time. I could repeat these steps and sometimes the unversioned file will be there at the end and sometimes it won't. It's very sporadic. In larger repositories, though, I almost always see at least one unversioned file between branch updates.
Full output of hg update default follows. The output always displays as such whether or not the unversioned file is created.
resolving manifests
calling hook preupdate.eol: <function preupdate at 0x0000000002571668>
removing branch1.txt
0 files updated, 0 files merged, 1 files removed, 0 files unresolved
I was using an older version of hg when I first noticed it but the problem still exists after updating to 2.3.2. I am using Windows 7 Pro x64 with TortoiseHG 2.5.1 x64. I don't think it's related to Tortoise, however, because I can replicate the problem by just using hg from the command line.
The contents of my mercurial.ini file are:
[ui]
username=myname <myname@mydomain.com>
ignore=C:\users\myusername\.hgignore
verbose=true
trace=true
[eol]
native = CRLF
only-consistent = False
[extensions]
purge =
eol =
I can live with it, but it's a pain to make sure I'm not accidentally adding these files to the repository in changesets with other new files.
If someone has seen this and could point me to the culprit I'd be most appreciative!
If a file is in use when updating between changesets, the in-use file is renamed with the added numbers so the update can succeed.
Does disabling the eol extension help matters? I noticed that your test did not use a .hgeol file as well (that's one of the things associated with this extension). There's another thread hereabouts that is dedicated to some problems with this extension.
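Until the process holding the files open is identified, the leftovers can at least be listed before a commit. A minimal sketch, assuming a POSIX shell is available (e.g. Git Bash on Windows) and that the suffix is always a hyphen followed by eight lowercase hex digits, as in the names shown in the question; list_update_leftovers is a hypothetical name:

```shell
#!/bin/sh
# Hypothetical helper: list files whose names end in "-" plus eight hex
# digits, the shape of the stray copies shown in the question.
list_update_leftovers() {
    h='[0-9a-f]'
    # Skip Mercurial's own .hg directory and match only on the file name.
    find . -name .hg -prune -o -type f -name "*-$h$h$h$h$h$h$h$h" -print
}
```

Comparing this list with the output of hg status -u before running hg add helps keep the strays out of a changeset.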

hg remove -I PATTERN, how it works?

How do I remove all *.bak or *.orig files in Mercurial?
example:
C:\dev\web>hg stat
? Views\System\UnderConstruction.cshtml.bak
? Views\Topic\Index.cshtml.bak
? Views\Topic\MasterPage.cshtml.bak
? Web.config.bak
C:\dev\web>hg rem -I *.bak
abort: no files specified
hg remove only removes files that have already been committed. AFAIK, there is no command in mercurial to remove untracked files.
To learn how file patterns work in mercurial, run hg help patterns.
Untracked files (marked with "?") can be removed by the OS, not by Mercurial.
Alternatively, leave the files as they are and add patterns to .hgignore; files matching those patterns will then no longer appear in hg status.
The correct command for removing tracked .bak and .orig files is hg remove -I **.bak -I **.orig
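Removing the untracked backups with OS tools, as suggested above, could look like the sketch below. It assumes a POSIX shell (on Windows, e.g. Git Bash), and list_backup_files is a hypothetical name. Note that find does not know which files are tracked, so the list should be reviewed before anything is deleted.

```shell
#!/bin/sh
# Hypothetical helper: list *.bak and *.orig files below the current
# directory, skipping Mercurial's own .hg directory.
list_backup_files() {
    find . -name .hg -prune -o -type f \( -name '*.bak' -o -name '*.orig' \) -print
}

# Review the list first, then delete (assumes no newlines in file names):
#   list_backup_files | while IFS= read -r f; do rm -- "$f"; done
```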
You should take a look at the hg purge extension:
Delete files not known to Mercurial. This is useful to test local and
uncommitted changes in an otherwise-clean source tree.
This means that purge will delete:
Unknown files: files marked with "?" by "hg status"
Empty directories: in fact Mercurial ignores directories unless they contain files under source control management
But it will leave untouched:
Modified and unmodified tracked files
Ignored files (unless --all is specified)
New files added to the repository (with "hg add")
If directories are given on the command line, only files in these
directories are considered.
Be careful with purge, as you could irreversibly delete some files you
forgot to add to the repository. If you only want to print the list of
files that this program would delete, use the --print option.
You can do the following two commands:
D:\workspace>hg purge -I **/*.orig --all
and then:
D:\workspace>hg purge -I **/*.bak --all
Tracked files won't be deleted, but I'm guessing that's not an issue for you. Make sure that you enable the purge extension before running this, and you can do dry runs with the --print argument.

File in repository after clone, but no history

We have a Mercurial repository converted from Subversion a while ago and have today noticed that there are files in the repository that have no history whatsoever.
One of the symptoms of this behaviour is that hg status reports the file as clean, while hg log reports no changesets for the same file:
> hg clone [repo]
> hg st -c FileWithMissingHistory.cs
C FileWithMissingHistory.cs
> hg blame FileWithMissingHistory.cs
FileWithMissingHistory.cs: no such file in rev [...]
> hg log FileWithMissingHistory.cs
> hg log FileWithMissingHistory.cs -f
abort: cannot follow nonexistent file: "FileWithMissingHistory.cs"
> hg log -v | grep FileWithMissingHistory.cs
[gives output, there are changesets mentioning the file]
Obviously the filenames have been changed in the example. I've tried using hg verify, but this command reports that the repo is fine. Has anyone experienced this and is there anything we could do to bring the history "back to life"? Placing dummy history on the files in question would be acceptable, but suboptimal.
EDIT:
I've done some more investigation and noticed that "FileWithMissingHistory.cs" was renamed from another filename (hg copy + delete) in revision 238. If I do hg update -r238 and hg log on the file at this revision I do not get any history. Doing hg log on the original file reports the history as expected, so it seems that the history is somehow lost during copy (again, the file is renamed using hg copy, and the changeset clearly indicates that the file has been copied).
Sounds strange, actually impossible. What I would try to debug this issue is to update to different revisions and check at which revision the file appears in the working copy the first time. If you do this in a binary search fashion (similar to how the bisect extension works), you should find a revision which introduces the file after a few updates.
This does not solve the problem, but it may help in tracking down its source.
I've finally tracked down the cause of the effects described above: it seems to be a mixed-case issue. Some of the files are located in directories with lowercase names, while others are located in directories with the same names in mixed case (e.g. "directory/FileWithHistory.cs" and "DiReCtOrY/FileWithMissingHistory.cs"). On Windows, both files end up in the same directory, which causes the problem.
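A quick way to check a repository for this kind of clash is to compare the directory names in its file list case-insensitively. The following is a rough sketch, not part of the original diagnosis: find_case_colliding_dirs is a hypothetical helper that reads one path per line with "/" as the separator, and feeding it from hg manifest --all is an assumption about where the path list comes from.

```shell
#!/bin/sh
# Hypothetical helper: read one repository path per line on stdin and report
# directory names that differ only in letter case.
find_case_colliding_dirs() {
    awk '{
        n = split($0, part, "/")
        prefix = ""
        # Walk every directory prefix of the path (the last part is the file).
        for (i = 1; i < n; i++) {
            prefix = (i == 1) ? part[i] : prefix "/" part[i]
            key = tolower(prefix)
            if (!(key in first)) {
                first[key] = prefix            # first spelling seen
            } else if (first[key] != prefix && !((key, prefix) in reported)) {
                print first[key] " <-> " prefix
                reported[key, prefix] = 1
            }
        }
    }'
}

# Possible usage from inside the repository:
#   hg manifest --all | find_case_colliding_dirs
```

Each output line pairs the first spelling seen with a later spelling that would map to the same directory on a case-insensitive file system.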