How do I permanently remove (obliterate) files from history? - mercurial

I commited (not pushed) a lot of files locally (including binary files removing & adding...) and now when I try to push it takes a lot of time. Actually I messed up my local repo history.
How could I avoid this mistake in the future ? Can I transform a set of local revision 1->2->3->4 to 1->2 with 2 being the final revision of the local clone ?
edit: since I was in hurry I started a new remote repo from scratch with revision 4. In the future I will go with the marked answer as it seems easier but I will dig other solutions to see the truth. Thx for your support.

It's not clear from your question whether those changes got pushed. If they're still local, you can more or less get rid of them easily. convert is one option. You can also use MQ (mercurial queues). Check the EditingHistory wiki article for a detailed explanation. It recommends MQ being the simplest approach.
To prevent that kind of mistakes, you should probably add a hook to reject 'bad' commits, given that you can describe them programatically ;)

Mercurial history is immutable, you can't delete using the normal tools. You can, however, create a new repo without those files:
$ hg clone -r 1 repo-with-too-much new-repo
that takes only revisions zero and one from the old repo and puts them into a new repo. Now copy the files from revision four into the new repo and commit.
This gets rid of those interstitial changesets, but any repo you have out there in the wild still has them, so when you pull you'll get them back.
In general once you've pushed a changeset it's out there and unless you can get everyone with a clone to delete it and reclone you're out of luck.

Related

Teaching a mercurial repository about a bad rename after it is pushed

We have moves and renames of files in our published mercurial history that were not properly recorded, so that they appear in the history as unrelated deletions and adds.
Is there any way to tell the repository about the connections so that --follow commands can work again?
(For non-pushed changes, here is a question discussing how to get mercurial to properly record moves/renames before you commit, as well as a useful tip here.)
One solution that works but is a bit brutal: You can login remotely on your central Mercurial server, fix the renames there as a local change and then ask everyone to clone the repo again.
This works since all Mercurial repos are equal. You just think of one as "central" but in fact it's a repo like any other. So if you have access to that, you can rewrite history there. The drawback is of course that every developer will notice, so they'll have to export any non-pushed changes they made, clone the repo again and then import the patch.
[EDIT] A possible workaround would be to create a new branch just before you did the bad rename, rename the files properly and then cherry pick all the changes after that into the new branch.
I'm not sure if it's a good idea to merge this branch into your original branch. If you did, then Mercurial would see two kinds of file renames in the past and I'm not sure which one it would follow.
So before you merge, I suggest that you create a small test/demo repo where you reproduce the situation and then try it out.
You can start a new head from before the rename, do the rename properly, then merge it into the head from the upstream server. It will give you a conflict. When this happens, revert the offending files to the revision from your new head (with the good renames; hg revert -r <rev> <files...>).

Mercurial: abandoning multiple, contiguous commits, on the `default` branch

Requirement
I'd like to abandon a line of development on the default branch, winding back to a revision from about 15 change sets back, and have default proceed from there.
My setup
This is a solo development project with one other guy testing infrequently. I push (frequently) to bitbucket for backups and sharing with the tester. Several of the changes I want to abandon are pushed to BitBucket.
Options
Any of these would be fine…
The abandoned change sets to continue to exist in the repo. It would be nice if they could live on their own branch abandoned-experiment-1, say, that I can close and ignore, but this would need them to move on to a retrospectively created branch (which seems like it would be an awesome feature?).
Some kind of merge to happen where I add a new revision to default that is the rollback to the revision I want to continue from.
The change sets to be destroyed, but I suspect there's no way to achieve that without replacing the BitBucket repo and my tester's repo, which I'm not keen on.
I'm not too sure how to evaluate which options are possible, which is best, or whether there are other, better options. I'm also not sure how to actually proceed with the repo update!
Thank you.
You do have several options (Note that I'm assuming that you are dispensing with all changes in the 15 or so revisions and not trying to keep small bits of them):
Easiest is kinda #2: You can close anonymous branches just like named branches; Tag the tip first with abandoned-development if you wish; hg update to the point you wish to continue from; and continue to code as normal. (You may need to create the new head for new development before you can close the old one. I haven't tested it yet.)
Regarding #3: Based on my cursory read, it does appear that bitbucket has a strip command. If you (both locally and on bitbucket) and your tester strip the offending changesets, you can carry on your merry way and pretend like they never existed.
Achieving #1: If you are definitely set on getting them to a named branch, you could strip them at the remote repos and then hg rebase them onto a new branch locally and then close that branch.
Personally, I try not to mess with history when I can avoid it, so I'd go with the easiest.
Mercurial now has (yet experimental) support for changeset evolution. That is you are able to abandon or rebase already pushed changesets. Internally this works by hiding obsolete changesets (i.e. practically nothing is stripped, only new replacement revisions are added to the history, that is why it works across multiple clones).
To extend #Edward's suggestions, you could also update to the last good revision, continue to commit from there and then merge in the head of the bad changesets using a null-merge:
hg up <good-revision>
... work ... commit ...
hg merge <head-of-bad-revisions>
hg revert --all -r .
hg commit -m 'null-merge of abandoned changesets'
This may be what you thought of as option 2.

Is there any way of deleting a branch that has been pushed to a repo shared by all developers?

I have accidentally pushed a branch to a repo. Is there anyway I could alter the repo ( and remove the branch )? Closing it is not a solution.
You got a couple of options, none of them easy, and none of them will leave you with a "phew, saved by the bell" feeling afterwards.
The only real way to fix this problem is to try to avoid it in the first place.
Having said that, let's explore the options here:
Eradicate the changesets
Introduce further changesets that "undo" the changes
The first option, to eradicate the changesets, is hard. Since you pushed the changesets to your central repository, you need direct access to the repositories on that server.
If this is a server where you don't have direct access to the repositories, only through a web interface, or through push/pull/clone, then your option is to hope that the web interface have methods for eradicating those changesets, otherwise go to option 2.
In order to get rid of the changesets, you can either make a new clone of the repository with the changesets, and specify options that stop just shy of introducing the changesets you want to get rid of, or you can use the MQ extension and strip the offending changesets out.
Either is good, but personally I like the clone option.
However, this option hinges on the fact that any and all developers that are using the central repository either:
Have not already pulled the offending changesets from the central repository.
Or are prepared to get rid of said changesets locally as well.
For instance, you could instruct all your developers to kill their local clones, and reclone a fresh copy after you have stripped away the changesets in the central repository.
Here's the important part:
If you cannot get all developers to help with this, you should drop this line of thought and go to option 2 instead
Why? Because now you have two problems:
You need to introduce barriers that ensure no developers can push the same changesets onto the server again, after you got rid of them. Note that relying on the warning by the server to prevent new branches being pushed is perhaps not good enough, as developers might have branches of their own they want to push, and thus not notice that they'll be pushing yours as well.
Any work any developer has done based on any of the offending changesets must either be rebased to a new place, or eradicated as well.
In short, this will give you lots of extra work. I would not do this unless the offending changesets were super-critial to get rid of.
Option 2, on the other hand, comes with its own problems, but is a bit easier to carry out.
Basically you use the hg backout command to introduce a new changeset that reverses the modifications done by the offending changesets, and commit and push that.
The problem here is that if at some point you really want to introduce those changesets, you will have to fight a bit with Mercurial in order to get the merges right.
However, there will be no more work for your fellow developers. The next time they pull, they'll get your correction changeset as well.
Let me just restate this option in different words:
Instead of getting rid of the changesets, keep them, but introduce another changeset that reverses the changes.
Neither option is good, both will generate some extra work.
We've ran into a similar problem once, when we had to remove a branch from the server repo from which all devs regularly pull. Backout wasn't an option because the problematic branch had already been pulled by everyone.
We stripped (hg strip from the MQ extension) the branch in the server repo. From now on, if a developer tried to push, he had a message “push creates new remote branches”, even though they didn't actually created any. We created a batch file with the strip command, distributed it among the devs and explained the “new remote branches” is a signal to run the batch file.
This approach takes some time and effort before everybody gets rid of the branch, but it works.
If the 'backout' option described in Jason's comment above doesn't do it for you, you can remake the repo up until the point of your mistaken push using hg convert, which (despite its name) also works with hg.
eg hg convert -r before-mistaken-push /path/to/original /path/to/new
You might have to play with the usebranchnames and clonebranches settings.

Are deleted files still downloaded with an hg clone?

In our Mercurial repo we added a really big file (and did an hg push), then deleted the big file (and did another push).
Now if someone does an hg clone will they still pull down that big file? I know it won't appear in their working directory as it was deleted, but will the file still be pulled down and stored in Mercurial internal storage?
I'd like to ensure people don't have to pull down the file. I've learned that really big files should be stored outside of Mercurial, so I deleted the file. But I was wondering if people will still be pulling down the big file - in which case I guess I will recreate the repository from scratch.
Of course it will still be in the repository.
You can always update back to older revisions, and if you update back to the revision you got when you committed the file, it'll be there in all its glory.
There are two ways to mitigate this (when you're committing, not now):
One of the big-files extensions, these essentially add big files to a secondary repository and link the two, so that if you update to a revision where the file doesn't exist, and you don't already have it, it will not get updated. ie. it's more a "on-demand" style of pulling
If the file never changes, keep it available on the network and just create some kind of link to it instead of a full copy
Right now, you got four options:
Strip away the changeset that added the file, and all the changesets that came after it. You can do that using the Mercurial Queues extension. Note that you need to do this stripping in all clones. If just one of your users push back the repository that has that file in its history to the central clone, you have the changesets back.
Rebuild the repository from scratch manually
Using the hg convert command and some filtering, the --filemap option can be used for this
Leave it as is. How big is it, will be much of a problem?
Note that rebuilding the repository, either manually or through hg convert will invalidate all clones. Anyone trying to push to your new central clone from an old clone will get a message about unrelated repositories. If any of your users are stupi^H^H^H^H^Hnot smart enough to realize that forcing the push is a bad idea, then you will have problems with this approach.
Yes, the file is still in the history. If you want to delete it completely, you need to use Mercurial Queues — see Editing History on Mercurial wiki.
Just keep in mind this breaks clones as revision IDs change.

Using mercurial's mq for managing local changes

I have a local mercurial repository with some site-specific changes in it. What I would like to do is set a couple files to be un-commitable so that they aren't automatically committed when I do an hg commit with no arguments.
Right now, I'm doing complicated things with mq and guards to achieve this, pushing and popping and selecting guards to prevent my changes (which are checked into an mq patch) from getting committed.
Is there an easier way to do this? I'm sick of reading the help for all the mq commands every time I want to commit a change that doesn't include my site-specific changes.
I would put those files in .hgignore and commit a "sample" form of those files so that they could be easily recreated for a new site when someone checks out your code.
I know that bazaar has shelve and unshelve commands to push and pop changes you don't want included in a commit. I think it's pretty likely that Mercurial has similar commands.
I'm curious why you can't have another repo checked out with those specific changes, and then transplant/pull/patch to it when you need to update stuff. Then you'd be primarily working from the original repo, and it's not like you'd be wasting space since hg uses links. I commonly have two+ versions checked out so I can work on a number of updates then merge them together.