Mercurial ignored file causes abort when trying to update to previous revision

Here's my scenario: When I initially created my Mercurial repo, I used hg add to add all *.pl *.sh and *.sql scripts to the repo. I later learned how to use the .hgignore file to exclude other files from the repo. One of the files I needed to exclude was a *.sql file that is generated by a script, so it is essentially a data file that constantly changes when the script runs that produces it; thus, I added it explicitly to the .hgignore file a few revisions ago.
Today, I want to update to a prior revision before this *.sql file was added to the .hgignore, so that I can create a branch off of it. However, when I try to update the working directory to this prior version, I get the following error:
a.sql: untracked file differs
abort: untracked files in working directory differ from files in requested revision
I know that one way I could get around this problem is to delete the file before trying to update to the prior revision, either by manually deleting it or by using hg update --clean.
That may work in this particular case, since the file is auto-generated by a script each time, and so I don't care about the data that is currently in it.
However, I'm trying to find out what is the safe way people would generally handle this situation when they decide to ignore a file set (like a set of data files that aren't auto-generated) and need to return to a previous revision before they were marked to be ignored, especially if they wanted to retain the most current content in those file sets while still being able to view earlier revisions of files that Mercurial is actively tracking.
I've also considered that you could back up the files, but I think that is only a reasonable solution if this is a one-off case. If you want the ability to hg update to previous revisions frequently and at a whim, then it becomes quite tedious to back up the data each time before you update to an earlier revision (it also doesn't guarantee that others won't delete data that isn't tracked in the repo).
Thanks for the help.

It depends.
If you have exclusive control over this repository, and have the practical ability to require everyone to re-clone from it, then you can use hg convert to exclude the files from the old revisions. This is by far the cleanest option, but it will change the revision identifiers (hashes) for those revisions and all of their topological descendants. This is why everyone has to re-clone; their old clones will not interact properly with the new repository.
If you can't do that, you can copy the files somewhere else (you do have backups already, right?), let the update clobber the originals with the old versions, and then restore them from your copy. This has to be done whenever you check out a revision in which the files are tracked, so it is definitely suboptimal. You may be able to make this slightly easier by keeping the files outside the repository and checking in symlinks to the files, but you'll still have to fix up the symlinks whenever you check out an old version.
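The copy, update, restore cycle from the previous paragraph can be scripted. What follows is only a minimal sketch in Python, not a Mercurial feature: update_preserving and its parameters are hypothetical names, and it assumes that hg is on the PATH and that the paths in keep are relative to the repository root.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def update_preserving(repo, rev, keep, run=subprocess.run):
    """Update `repo` to `rev`, then put back the current copies of `keep`.

    `keep` lists file paths relative to the repository root. `run` is
    injectable so the helper can be tried out without Mercurial installed.
    """
    repo = Path(repo)
    with tempfile.TemporaryDirectory() as stash_dir:
        stash = Path(stash_dir)
        saved = []
        for rel in keep:
            src = repo / rel
            if src.is_file():
                dst = stash / rel
                dst.parent.mkdir(parents=True, exist_ok=True)
                # Move the file away so hg finds no conflicting untracked file.
                shutil.move(str(src), str(dst))
                saved.append(rel)
        try:
            run(["hg", "update", "--rev", rev], cwd=str(repo), check=True)
        finally:
            # Restore the stashed copies even if the update failed.
            for rel in saved:
                target = repo / rel
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(str(stash / rel), str(target))
```

A call such as update_preserving(".", "REV", ["a.sql"]), with REV standing for the revision you want, would update and then put the current a.sql back. If the file is tracked in that revision, expect hg status to report it as modified afterwards, because the restored content differs from what was checked out.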
However, what you describe is not the normal use case for Mercurial. Typically, untracked files are autogenerated, or at least able to be regenerated from tracked files. The operating assumption is that untracked files are not important and can be discarded at any time. Mercurial doesn't actually do this, because that would be rude, but neither does it make any special effort to preserve them when (for example) you make a bundle of the repository.
If you need to deal with versioning of object files, it is typical to store them in a separate artifact repository or some other system. This can be more difficult to manage because you have to reunite the binaries with the source code when you do a build. But it is much more robust than keeping the binaries loose in the repository and hoping they won't get accidentally overwritten or deleted.
Another option is to collapse the binary to text and then place the text under version control. This is always possible (e.g. take a hexdump) but may or may not be practical or reasonable, depending on the file format. For a compressed file format (e.g. tarballs, most image files, etc.) the hexdump is not going to be any easier to 3-way merge than the original binary, so there's little point in it. Similarly, if the binary is huge, the hexdump will be huge too. On the other hand, if a binary is compiled from source code, it is entirely normal to store the source and discard the binary. For something structured like an SQLite database, you might try storing an SQL script which will generate the database. For a zip file or tarball, store the contents. And so on. All of these things can be regenerated using make or a similar tool whenever you check things out, and you can automate this with a repo hook.
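For the SQLite case, the idea of tracking a text script instead of the binary database can be sketched with Python's standard sqlite3 module. This is an illustration under assumptions, not a recommendation for any particular layout: dump_sql and rebuild_db are hypothetical helper names, and the file paths are whatever you choose to track and ignore.

```python
import sqlite3


def dump_sql(db_path, sql_path):
    """Write a text SQL script that recreates the database at db_path."""
    conn = sqlite3.connect(db_path)
    try:
        with open(sql_path, "w", encoding="utf-8") as out:
            # iterdump() yields the schema and data as SQL statements.
            for statement in conn.iterdump():
                out.write(statement + "\n")
    finally:
        conn.close()


def rebuild_db(sql_path, db_path):
    """Recreate the database from the script; db_path should not exist yet."""
    conn = sqlite3.connect(db_path)
    try:
        with open(sql_path, encoding="utf-8") as src:
            conn.executescript(src.read())
    finally:
        conn.close()
```

The SQL script is the file you would put under version control, while the database itself stays ignored and is rebuilt after each checkout, for example from a make rule.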

Related

Using mercurial, I added a new file and wrote code in it, then deleted that file. Can I retrieve it?

Pretty much the title. I've looked at a lot of similar questions asked here, and I can't seem to find something that applies.
Started by syncing with HEAD. Created a few new files. Filled in those files, they were being tracked at this point. I then not only deleted the files, but also removed them from being tracked (because of stupid UI). According to my understanding, those files are gone for good, but I thought I'd check with people who are smarter than me: Is it possible to retrieve them?
Mercurial does not store uncommitted changes, so if you did not commit the files then they are lost.
If you did commit them, then hg update -C will restore them (and all other files --- make sure there are no other changes you haven't committed and want to keep) to the latest commit for your working dir.

Mercurial: removing files from local repository to save space

I inherited a Mercurial repository that includes some directories with binary files.
All of the files in the aforementioned directories were removed from Mercurial a long time ago, and they will never be used again.
I've read that it is difficult to remove files permanently from Mercurial due to changeset hashes.
Is there any way to remove the data, either only from my local repository, or, preferably, from the bitbucket repository, too?
I don't care if the history of these files remains in Mercurial, I just don't want to waste storage on them.
If I can't remove the files nicely from bitbucket, can I just delete the corresponding .i files under .hg/store/data in my local repository? I'd retain the .i files in bitbucket. Would this cause any problems with my local repository, either by itself, or when pulling from or pushing to bitbucket? No one will ever clone from my local repository.
Thanks.
You can't really remove the data, but you can convert and filter a repository with the Convert extension. In your case, convert from Mercurial to Mercurial. See the --filemap option.
I don't think that counts as removing the data, because you end up with a new repository and new changeset ids. But you do end up with a new repository that excludes whatever you told it to exclude.
If you're the only developer, it seems safe to me to replace the existing bitbucket repository with your new one. But I'd test that on a scratch repository, myself. If you're not the only developer, take some time to plan the conversion. (And make sure it's worth the trouble. It's usually not--disk is cheap.)
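As a rough sketch of what such a conversion involves: the filemap is a plain text file with one directive per line, and an exclude directive drops a file or directory from the converted history. The Python helper below only writes that file and builds the command line; it does not run anything. convert_command, the repository names, and the filemap name are hypothetical, and the sketch assumes the excluded paths contain no spaces.

```python
from pathlib import Path


def convert_command(src, dst, excluded, filemap="filemap.txt"):
    """Write a convert filemap and return the matching `hg convert` argv.

    Each entry of `excluded` is a path relative to the repository root.
    """
    lines = [f"exclude {path}" for path in excluded]
    Path(filemap).write_text("\n".join(lines) + "\n")
    return [
        "hg",
        "--config", "extensions.convert=",  # enable the bundled convert extension
        "convert",
        "--filemap", str(filemap),
        str(src),
        str(dst),
    ]
```

Passing the returned list to subprocess.run(..., check=True) would perform the conversion into a new repository; the original repository is left untouched, which makes it easy to try this on a scratch copy first.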

A practical way to provide code updates via Mercurial without sharing main BitBucket account

I suspect this might be really obvious but I can't find a straightforward solution in the documentation or forums:
I have written some code that is held in a Mercurial repository on BitBucket.
I use this code to build Linux virtual servers. When I build a server, I clone the repo onto the server, run my build script, and then delete the clone. The result is a configured server with several files from my repo located in various folders on the server.
Now, I'm looking for a mechanism where I can roll out bug fixes and improvements to my users' servers after I have handed them over. At that time, I won't have SSH access to the servers and I cannot expect my end users to do anything more complicated than kick off a cron job or launch a script.
To achieve this, I'm thinking of setting up a BitBucket account for my users with read-only access to my repo.
I have no problem writing a script to clone my repo, via this read-only account, and apply the updates, but I don't want to include all my files. In particular, I want to exclude my build script as it is commercially sensitive. I know I could remove it from my repo, but then my build wouldn't work.
Reading around, it seems I may need to create a branch or a fork of my repo (which?). Or maybe a sub-repo? Then, I could remove the sensitive files from that branch/fork/sub-repo and allow my users to clone it via a script.
That's OK, but I need a way to update the branched/forked/sub repo as I make changes to the main one. Can this be automatic? In other words, can it be set up to always reflect the updates made in the main repo? Excluding the sensitive files of course.
I'm not sure I'd want updates to be automatic though, so I'd also like to know how to transfer updates from the main to the branch/fork/sub manually. A merge? If I do a merge, how do I make sure my sensitive files don't get copied across?
To sum up, I have a main repo which contains some sensitive files and I need a way to roll out updates of all but those sensitive files to my read-only users.
Sorry if this is hugely obvious. I'm sure it's a case of not seeing the wood for the trees and being overwhelmed by the possibilities. Thanks.
I don't think that you need to solve this in Mercurial at all.
What you actually need is Continuous Integration / a build server.
The simplest solution goes like this:
Set up a build server with something like TeamCity or Jenkins, that's always online and monitors changes in your Bitbucket repository.
You can set it up so that when there's a change in your repository, the build server runs your build script and copies the output to some FTP server, or download site, or whatever.
Now you have a single location that always contains the most recent code changes, but without the sensitive files like the build script.
Then, you can set up a script or cron job that your end users can run to get the newest version of the code from that central location.
You are OK with two branches: one for the users' clone (main) and another for your main development (dev). The tricky part is merging the new changes from dev to main.
You can solve this by excluding files in the merge process (see "Excluding a file while merging in Mercurial").
By setting the [merge-patterns] section in your .hgrc, you can specify which files are not affected by the merge.
[merge-patterns]
build.sh = internal:local
For more info read hg help merge-tools.
"internal:local"
Uses the local version of files as the merged version.
Entire Mercurial trees always get moved around together, so you can't clone or pull just part of a repository (along the file tree axis). You could keep a branch that has only part of the files, and then keep another branch that has everything, making it easy to merge the partial (in terms of files) branch into the other branch (but merging the other way wouldn't be particularly easy).
I'm thinking maybe subrepositories work for your particular use case.

Pushing/Pulling specific files/folders in Mercurial

I am (still) trying to completely migrate our company's SVN to HG.
For the most part I've succeeded, but we ran across a problem.
Our codebase has over 30 different projects, each in its own folder.
I've been asked multiple times how to commit and then push specific files to our central repository instead of being forced to commit everything everywhere before pushing, which is certainly annoying. Not being able to pull only specific projects is also a nuisance.
Is there any way to handle this like we used to in SVN? Where we could just commit what we wanted and not everything, and update only what was necessary.
Thank you.
A major difference between SVN and Mercurial is that you should have one repository per project in Mercurial.
You can change your repository to be multiple repositories using the convert extension.
As Steve Kaye said, you should create one repo per project. You may also want to create one master repo and include all your projects in it as subrepos. This will allow SVN-like behavior of getting a copy of everything.

How can I commit a set of files only once in Mercurial?

I have some files I'd like to add as a "backup". The thing is, I'd like to commit them only once and then have Mercurial stop tracking them (not notify me when they change, and not include them in other commits).
Basically, something like this:
hg add my_folder
hg commit -m "added first version of my_folder"
Then, after a while, the contents of that folder might change. And if I commit other files, the new version of that folder will get committed as well. This is something I'd like to avoid. Is it possible, without specifying directly which files I want to commit?
I've never seen any option in Mercurial that might allow that... but why not simply copy them elsewhere?
I mean, what's the point of using a Version Tracking System if you don't need versioning on these items anyway?
We ran into a similar case with binary documents ('.doc', images, etc...) and finally decided to commit them on a separate repository, dedicated to those.
I think the traditional way of doing this is to commit files named something like "file.ext.default", and just inform users that they should copy the defaults and modify the copies.
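The "copy the defaults" step can be automated so that users don't have to do it by hand. The following is a minimal sketch in Python under assumptions: install_defaults is a hypothetical helper, and the ".default" suffix simply follows the naming suggested above. The working copies it creates would normally be listed in .hgignore so that only the templates are tracked.

```python
import shutil
from pathlib import Path


def install_defaults(root, suffix=".default"):
    """Copy every `<name><suffix>` template to `<name>` unless it already exists.

    Returns the list of files that were created.
    """
    root = Path(root)
    created = []
    for template in sorted(root.rglob("*" + suffix)):
        # Skip directories and anything inside Mercurial's own metadata.
        if not template.is_file() or ".hg" in template.relative_to(root).parts:
            continue
        target = template.with_name(template.name[: -len(suffix)])
        if not target.exists():
            shutil.copy2(template, target)
            created.append(target)
    return created
```

Because existing files are never overwritten, running it again after an update leaves local modifications alone and only fills in copies for newly added templates.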
VCSs aren't backup systems. Consider using a proper backup mechanism.
Having said that, you should be able to do this using hooks. There are many ways you could do this, but ACLs would be an obvious one, assuming a remote server.