Condensing Mercurial revision history

Condensing Mercurial revision history - mercurial

We have 2,700+ revisions and it takes a good 30-45 seconds to load Mercurial when doing a merge, push or anything else with TortoiseHg. I'm wondering if there's a way other than straight up creating a new repository to clean up the revision history. Say, cut off files under revision 2,400 or so.

Not an answer to your question, but:
Maybe reducing "log batch size" to 100 (default is 500) in the settings helps.
Our 2300+ rev repo loads in 2-3 secs (off my 15k rpm SAS-disk, but never mind that), so I don't think your problem is many revs, really. There are much bigger repos out there. :)
Note that both Mercurial core and TortoiseHg developers are keen on finding performance bugs, so it might be worthwhile to ask on the mail-lists for assistance.

You can use the histedit extension to compress several changesets into one. Executing the histedit command on a range of revisions will spawn a text document that looks like this (from the histedit documentation):
pick c561b4e977df Add beta
pick 030b686bedc4 Add gamma
pick 7c2fd3b9020c Add delta
Edit history between c561b4e977df and 7c2fd3b9020c
Commands:
p, pick = use commit
e, edit = use commit, but stop for amending
f, fold = use commit, but fold into previous commit
d, drop = remove commit from history
Changing pick to fold for a certain changeset in the list above will fold it into the previous changeset. It will give you an opportunity to resolve failed merges and enter a new commit message as well.
WARNING:
Using histedit will modify the repository history, including hash IDs, which will cause problems unless you re-start each developer with a new repository clone after the changes have been made. Also, you would probably need to limit your histedit-ing to changesets with a single parent (ie: non-merge changesets).

Related

How to clone from a specific revision to the last one using Mercurial?

In Mercurial , How to clone from a specific revision to the last one using ?
For example repo A have one line history from changeset 0 to changeset 100. and I want to clone A to my local repo from changeset 90 to last one (100).
Looking through the help, I noticed the -r flag but that only clone 1 specific changeset.
And if there is no way to do it can somebody explain why its not implemented ? its considered a bad thing to do ?
Thanks.

You can't.
The current state of the project is all changesets from the beginning of time up until the specific changeset, you cannot prune older changesets without rewriting the history of the repository to permanently get rid of them. This will also make the repository incompatible with the original that contains the old history.
In short, you will have to do one of the following:
Prune the old history, permanently getting rid of it, which will make it impossible to push/pull with original clones that still has that history
Live with the history
The parameters to the clone command that specifies revsets thus only allow you to set an upper limit. This may allow you to avoid whole branches, if they aren't merged into the branch you end up cloning, but the clone command will always clone everything from the beginning of time.
For every changeset you clone, every predecessor will be cloned as well, and this cannot be avoided.

How to revert a file to an earlier version in Mercurial?

I made some changes to a file and committed it. (In fact there were several commits).
Then I wanted to revert to the earlier version and lose all those changes.
I did something like:
hg update -r nnn where nnn was the reversion number of the changeset I wanted to go back to.
That worked. I was happy.
Then, later, I had to push my local repository to the remote. But when I did hg push I got a message about there being two heads on this branch and one of them not being known to the remote repositiory. It suggested I merge before pushing. (I think).
I googled this and found a page that suggested I do "hg merge". I did that. Now the resultant file is back to where I started. I.e. it contains all the changes I wanted to throw away.
Where did i go wrong?
EDIT:
I have found this post Mercurial — revert back to old version and continue from there
where it says:
If later you commit, you will effectively create a new branch. Then
you might continue working only on this branch or eventually merge the
existing one into it.
That sounds like my case. Something went wrong at the merging stage it seems. Was I on the wrong branch when I did "hg merge"?

You're past this point now but if it happens again, and it's just a single file you want to revert then consider:
hg revert --rev REVISION_YOU_LIKED path/to/just/one/file.txt
That doesn't update you whole repository to a different revision, and it doesn't create any commits. It just takes a single file in your working directory and makes it look like it used to. After doing that you can just commit and you're set.
That's not the way to go if you want to undo all the changes you've made to all files, but for reverting a single file use revert and avoid multiple heads and merging entirely.

No, nothing went wrong at the merge stage – Mercurial did exactly what you asked it to...
What merge means is that you take the changes on your current branch, and the changes on the 'other' branch, and you merge them. Since your original changes were in the 'other' branch, Mercurial carefully merged them back into your current branch.
What you needed to do was to discard the 'other' branch. There are various ways of doing that. The Mercurial help pages discuss the various techniques, but there are pointers in other SO questions: see for example Discard a local branch in Mercurial before it is pushed and Remove experimental branch.
(Edit) Afterthought: the reason you got a warning about there being two heads on the branch is because having two heads is often a temporary situation, so pushing them to a remote repository is something you don't want to do accidentally. Resolutions are (i) you did mean to push them, so use --force to create two heads in the remote repository; (ii) ooops!, you meant to merge them before pushing, so do that; or (iii) ooops!, you'd abandoned the 'other' one, so get rid of it. Your case was (iii).

Mercurial: abandoning multiple, contiguous commits, on the `default` branch

Requirement
I'd like to abandon a line of development on the default branch, winding back to a revision from about 15 change sets back, and have default proceed from there.
My setup
This is a solo development project with one other guy testing infrequently. I push (frequently) to bitbucket for backups and sharing with the tester. Several of the changes I want to abandon are pushed to BitBucket.
Options
Any of these would be fine…
The abandoned change sets to continue to exist in the repo. It would be nice if they could live on their own branch abandoned-experiment-1, say, that I can close and ignore, but this would need them to move on to a retrospectively created branch (which seems like it would be an awesome feature?).
Some kind of merge to happen where I add a new revision to default that is the rollback to the revision I want to continue from.
The change sets to be destroyed, but I suspect there's no way to achieve that without replacing the BitBucket repo and my tester's repo, which I'm not keen on.
I'm not too sure how to evaluate which options are possible, which is best, or whether there are other, better options. I'm also not sure how to actually proceed with the repo update!
Thank you.

You do have several options (Note that I'm assuming that you are dispensing with all changes in the 15 or so revisions and not trying to keep small bits of them):
Easiest is kinda #2: You can close anonymous branches just like named branches; Tag the tip first with abandoned-development if you wish; hg update to the point you wish to continue from; and continue to code as normal. (You may need to create the new head for new development before you can close the old one. I haven't tested it yet.)
Regarding #3: Based on my cursory read, it does appear that bitbucket has a strip command. If you (both locally and on bitbucket) and your tester strip the offending changesets, you can carry on your merry way and pretend like they never existed.
Achieving #1: If you are definitely set on getting them to a named branch, you could strip them at the remote repos and then hg rebase them onto a new branch locally and then close that branch.
Personally, I try not to mess with history when I can avoid it, so I'd go with the easiest.

Mercurial now has (yet experimental) support for changeset evolution. That is you are able to abandon or rebase already pushed changesets. Internally this works by hiding obsolete changesets (i.e. practically nothing is stripped, only new replacement revisions are added to the history, that is why it works across multiple clones).
To extend #Edward's suggestions, you could also update to the last good revision, continue to commit from there and then merge in the head of the bad changesets using a null-merge:
hg up <good-revision>
... work ... commit ...
hg merge <head-of-bad-revisions>
hg revert --all -r .
hg commit -m 'null-merge of abandoned changesets'
This may be what you thought of as option 2.

What to do instead of squashing commits in Mercurial

I've got my IDE set to commit locally every time I save anything. I'd ideally like to keep an uncensored record of my idiot fumblings for the rare occasions they may be useful. But most of the time it makes my history way to detailed.
I'd like to know a good strategy to keep that history but be able to ignore it most of the time. My IDE is running my own script every time I save, so I have control over that.
I'm pretty new to Mercurial, so a basic answer might be all I need here. But what are all the steps I should do when committing, merging, and reporting to be able to mostly ignore these automatic commits, but without actually squashing them? Or am I better off giving up and just squashing?
Related question about how to squash with highly rated comment suggesting it might be better to keep that history
Edit - My point here is that if Mercurial wants to keep all your history (which I agree with), it should let you filter that history to avoid seeing the stuff you might be tempted to squash. I would prefer not to squash, I'm just asking for help in a strategy to (in regular usage, though not quite always) make it look as much as possible like I did squash my history.

You want to keep a detailed history in your repo, but you want to have (and be able to export) an idealized history that only contains "reasonable" revsets, right? I can sympathize.
Solution 1: Use tags to mark interesting points in the history, and learn to ignore all the messy bits between them.
Solution 2: Use two branches and merge. Do your development in branch default, and keep a parallel branch release. (You could call it clean, but in effect you are managing releases). Whenever default is in a stable state that you want to checkpoint, switch to branch release and merge into it the current state of default-- in batches, if you wish. If you never commit anything directly to release, there will never be a merge conflict.
(original branch) --o--o--o--o--o--o--o (default)
\ \ \
r ... ... --r--------r (release)
Result: You can update to any revision of release and expect a functioning state. You can run hg log -r release and you will only see the chosen checkpoints. You can examine the full log to see how everything happened. Drawbacks: Because the release branch depends on default, you can't push it to another repo without bringing default with it. Also hg glog -r release will look weird because of the repeated merges.
Solution 3: Use named branches as above, but use the rebase extension instead of merging. It has an option to copy, rather than move outright, the rebased changesets; and it has an option --collapse that will convert a set of revisions into a single one. Whenever you have a set of revisions r1:tip you want to finalize, copy them from default to release as follows:
hg rebase --source r1 --dest release --keep --collapse
This pushes ONE revision at the head of release that is equivalent to the entire changeset from r1 to the head of default. The --keep option makes it a copy, not a destructive rewrite. The advantage is that the release branch looks just as you wanted: nice and clean, and you can push it without dragging the default branch with it. The disadvantage is that you cannot relate its stages to the revisions in default, so I'd recommend method 2 unless you really have to hide the intermediate revisions. (Also: it's not as easy to squash your history in multiple batches, since rebase will move/copy all descendants of the "source" revision.)
All of these require you to do some extra work. This is inevitable, since mercurial has no way of knowing which revsets you'd like to squash.

it should let you filter that history to avoid seeing the stuff you might be tempted to squash
Mercurial has the tools for this. If you just don't want see (in hg log, I suppose) - filter these changesets with revsets:
hg log -r "not desc('autosave')"
Or if you use TortoiseHg, just go View -> Filter Toolbar, and type in "not desc('autosave')" in the toolbar. Voila, your autosave entries are hidden from the main list.

If you actually do want to keep all the tiny changes from every Ctrl-S in the repo history and only have log show the subset of the important ones, you could always tag the "important" changesets and then alias log to log -r tagged(). Or you could use the same principle with some other revset descriptor, such as including the text 'autosave' in the auto-committed messages and using log -r keyword(autosave), which would show you all non-autosaved commits.
To accomplish your goal, at least as I'd approach it, I'd use the mq extension and auto-commit the patch queue repository on every save. Then when you've finished your "idiot fumblings" you can hg qfinish the patch as a single changeset that can be pushed. You should (as always!) keep the changes centered around a single concept or step (e.g. "fixing the save button"), but this will capture all the little steps it took to get you there.
You'd need to
hg qinit --mq once to initialze the patch queue repo (fyi: stored at \.hg\patches\)
hg qnew fixing-the-save-btn creates a patch
then every time you save in your IDE
hg qrefresh to update the patch
hg commit --mq to make the small changeset in the patch queue repo
and when you are done
hg qfinish fixing-the-save-btn converts the patch into a changeset to be pushed
This keeps your fumblings local to your repo complete with what was changed every time you saved, but only pushes a changeset when it is complete. You could also qpop or qpush to change which item you were working on.
If you were to try the squash method, you'd lose the fumbling history when you squashed the changesets down. Either that or you'd be stuck trying to migrate work to/from the 'real' repository, which, I can tell you from experience, you don't want to do. :)

I would suggest you to use branches. When you start a new feature, you create a new branch. You can commit as many and often as you like within that branch. When you are done, you merge the feature branch into your trunk. In this way, you basically separate the history into two categories: one in fine-grain (history in feature branches), and the other in coarse-grain (history in the trunk). You can easily look at either one of them using the command: hg log --branch <branch-name>.

How do I permanently remove (obliterate) files from history?

I commited (not pushed) a lot of files locally (including binary files removing & adding...) and now when I try to push it takes a lot of time. Actually I messed up my local repo history.
How could I avoid this mistake in the future ? Can I transform a set of local revision 1->2->3->4 to 1->2 with 2 being the final revision of the local clone ?
edit: since I was in hurry I started a new remote repo from scratch with revision 4. In the future I will go with the marked answer as it seems easier but I will dig other solutions to see the truth. Thx for your support.

It's not clear from your question whether those changes got pushed. If they're still local, you can more or less get rid of them easily. convert is one option. You can also use MQ (mercurial queues). Check the EditingHistory wiki article for a detailed explanation. It recommends MQ being the simplest approach.
To prevent that kind of mistakes, you should probably add a hook to reject 'bad' commits, given that you can describe them programatically ;)

Mercurial history is immutable, you can't delete using the normal tools. You can, however, create a new repo without those files:
$ hg clone -r 1 repo-with-too-much new-repo
that takes only revisions zero and one from the old repo and puts them into a new repo. Now copy the files from revision four into the new repo and commit.
This gets rid of those interstitial changesets, but any repo you have out there in the wild still has them, so when you pull you'll get them back.
In general once you've pushed a changeset it's out there and unless you can get everyone with a clone to delete it and reclone you're out of luck.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008