Clean revision history in Mercurial

Clean revision history in Mercurial - mercurial

I'm new with Mercurial and I'm still trying to establish a workflow.
I clone a remote repository. I then make changes and commits to my local repository. I would like to push a clean revision history. I don't want to push some revisions that may only muddy the remote repository (e.g. typo rename of methods, adding of javadoc, removing whitespaces, etc). Is this possible?
Thank you!

Indeed there are cases when it makes sense to prune/strip/polish changesets or to separate private/in-progress and public/ready changes. Here are some suggestion how to handle this with Mercurial:
If you made a series of small commits (which is a good thing), but you prefer to publish them as one changeset (which makes sense if you committed - possibly contradicting - snapshot changesets), use the collapse extension:
$ hg collapse -r <first-ref>:<last-ref> # both revs including
If you just have changes you want to keep private while others you want to publish, either (a) use mercurial queues for you private changes or (b) commit you private-only changes in an own branch (named, bookmarked, or cloned) and regularly merge it with the public branch. The latter requires you to switch between branches quite often if you commit private and public changes alternating.
UPDATE
To illustrate option 2.b, consider this graph of changesets:
0--1--4--5---8--9 <= public/stable branch
\ \
2--3--6--7 <= private/dev branch
This means you've made changes 2 and 3 on the dev branch. Then you did some work on the stable branch (revisions 4 and 5). These changes also make sense in dev, so you merge them into dev (revision 6). Finally you make another experimental change in dev (revision 7) and some ready-to-publish improvements in stable (revisions 8 and 9). You can now publish (i.e. push to the remote repository) the changes made in stable, by running
$ hg push -r 9 # or `-r stable` if the branch is named or bookmarked as such
All private changes will stay local!
Note that if you plan to polish your private commits later (e.g. collapse them to one changeset), you must not merge in the stable branch (collapse cannot work across merges). Instead, rebase you private changes whenever you want them to be in sync with the latest changes in stable.

This is more a question about revision control systems in general than about mercurial. Personally I would push everything, what you're suggesting kinda goes against versioning systems purposes.
However if you want to read more about hg workflows, try having a look at this guide or this wiki.
You could also be interested in the rollback command, that enables you to fix the latest commit.
edit: the option I would go for is to keep two branches: one identical to the remote repository where you commit only "official" changes, and one where you merge new official commits and make additional private commits.

If all the parasitic changes have been made at last, you can simply strip them.
If not, you can try to use the Patch Queue to remove the unnecessary commits:
Import your changes one by one to MQ and unapply the patches
Delete the patches that match the changes you don't want
reapply the remaining patches
This will work for small commits. Please if you try this, do it on a clone of your local repo to avoid loss.
Also, take care not to alter an already 'shared' (pushed) history.

Histedit also can be usable and more easy (than MQ) solution for editing history

Related

Mercurial: devs work on separate folders, why do they have to merge all the time

I have four devs working in four separate source folders in a mercurial repo. Why do they have to merge all the time and pollute the repo with merge changesets? It annoys them and it annoys me.
Is there a better way to do this?

Assuming the changes really don't conflict, you can use the rebase extension in lieu of merging.
First, put this in your .hgrc file:
[extensions]
rebase =
Now, instead of merging, just do hg rebase. It will "detach" your local changesets and move them to be descendants of the public tip. You can also pass various arguments to modify what gets rebased.
Again, this is not a good idea if your developers are going to encounter physical merge conflicts, or logical conflicts (e.g. Alice changed a feature in file A at the same time as Bob altered related functionality in file B). In those cases, you should probably use a real merge in order to properly represent the relevant history. hg rebase can be easily aborted if physical conflicts are encountered, but it's a good idea to check for logical conflicts by hand, since the extension cannot detect those automatically.

Your development team are committing little and often; this is just what you want so you don't want to change that habit for the sake of a clean line of commits.
#Kevin has described using the rebase extension and I agree that can work fine. However, you'll also see all the work sequence of each developer squished together in a single line of commits. If you're working on a stable code base and just submitting quick single-commit fixes then that may be fine - if you have ongoing lines of development then you might not won't want to lose the continuity of a developer's commits.
Another option is to split your repository into smaller self-contained repositories.
If your developers are always working in 4 separate folders, perhaps the contents of these folders can be modularised and stored as separate Mercurial repositories. You could then have a separate master repository that brought all these smaller repositories together within the sub-repository framework.

Mercurial is distributed, it means that if you have a central repository, every developer also has a private repository on his/her workstation, and also a working copy of course.
So now let's suppose that they make a change and commit it, i.e., to their private repository. When they want to hg push two things can happen:
either they are the first one to push a new changeset on the central server, then no merge will be required, or
either somebody else, starting from the same version, has committed and pushed before them. We can see that there is a fork here: from the same starting point Mercurial has two different directions, thus a merge is required, even if there is no conflict, because we do not want four different divergent contexts on the central server (which by the way is possible with Mercurial, they are called heads and you can force the push without merge, but you still have the divergence, no magic, and this is probably not what you want because you want to be able to checkout the sum of all the contributions..).
Now how to avoid performing merges is quite simple: you need to tell your developers to integrate others changes before committing their own changes:
$ hg pull
$ hg update
$ hg commit -m"..."
$ hg push
When the commit is made against the latest central version, no merge should be required.
If they where working on the same code, after pull and update some running of tests would be required as well to ensure that what was working in isolation still works when other developers work have been integrated. Taking others contributions frequently and pushing our own changes also frequently is called continuous integration and ensures that integration issues are discovered quickly.
Hope it'll help.

Mercurial: abandoning multiple, contiguous commits, on the `default` branch

Requirement
I'd like to abandon a line of development on the default branch, winding back to a revision from about 15 change sets back, and have default proceed from there.
My setup
This is a solo development project with one other guy testing infrequently. I push (frequently) to bitbucket for backups and sharing with the tester. Several of the changes I want to abandon are pushed to BitBucket.
Options
Any of these would be fine…
The abandoned change sets to continue to exist in the repo. It would be nice if they could live on their own branch abandoned-experiment-1, say, that I can close and ignore, but this would need them to move on to a retrospectively created branch (which seems like it would be an awesome feature?).
Some kind of merge to happen where I add a new revision to default that is the rollback to the revision I want to continue from.
The change sets to be destroyed, but I suspect there's no way to achieve that without replacing the BitBucket repo and my tester's repo, which I'm not keen on.
I'm not too sure how to evaluate which options are possible, which is best, or whether there are other, better options. I'm also not sure how to actually proceed with the repo update!
Thank you.

You do have several options (Note that I'm assuming that you are dispensing with all changes in the 15 or so revisions and not trying to keep small bits of them):
Easiest is kinda #2: You can close anonymous branches just like named branches; Tag the tip first with abandoned-development if you wish; hg update to the point you wish to continue from; and continue to code as normal. (You may need to create the new head for new development before you can close the old one. I haven't tested it yet.)
Regarding #3: Based on my cursory read, it does appear that bitbucket has a strip command. If you (both locally and on bitbucket) and your tester strip the offending changesets, you can carry on your merry way and pretend like they never existed.
Achieving #1: If you are definitely set on getting them to a named branch, you could strip them at the remote repos and then hg rebase them onto a new branch locally and then close that branch.
Personally, I try not to mess with history when I can avoid it, so I'd go with the easiest.

Mercurial now has (yet experimental) support for changeset evolution. That is you are able to abandon or rebase already pushed changesets. Internally this works by hiding obsolete changesets (i.e. practically nothing is stripped, only new replacement revisions are added to the history, that is why it works across multiple clones).
To extend #Edward's suggestions, you could also update to the last good revision, continue to commit from there and then merge in the head of the bad changesets using a null-merge:
hg up <good-revision>
... work ... commit ...
hg merge <head-of-bad-revisions>
hg revert --all -r .
hg commit -m 'null-merge of abandoned changesets'
This may be what you thought of as option 2.

What to do instead of squashing commits in Mercurial

I've got my IDE set to commit locally every time I save anything. I'd ideally like to keep an uncensored record of my idiot fumblings for the rare occasions they may be useful. But most of the time it makes my history way to detailed.
I'd like to know a good strategy to keep that history but be able to ignore it most of the time. My IDE is running my own script every time I save, so I have control over that.
I'm pretty new to Mercurial, so a basic answer might be all I need here. But what are all the steps I should do when committing, merging, and reporting to be able to mostly ignore these automatic commits, but without actually squashing them? Or am I better off giving up and just squashing?
Related question about how to squash with highly rated comment suggesting it might be better to keep that history
Edit - My point here is that if Mercurial wants to keep all your history (which I agree with), it should let you filter that history to avoid seeing the stuff you might be tempted to squash. I would prefer not to squash, I'm just asking for help in a strategy to (in regular usage, though not quite always) make it look as much as possible like I did squash my history.

You want to keep a detailed history in your repo, but you want to have (and be able to export) an idealized history that only contains "reasonable" revsets, right? I can sympathize.
Solution 1: Use tags to mark interesting points in the history, and learn to ignore all the messy bits between them.
Solution 2: Use two branches and merge. Do your development in branch default, and keep a parallel branch release. (You could call it clean, but in effect you are managing releases). Whenever default is in a stable state that you want to checkpoint, switch to branch release and merge into it the current state of default-- in batches, if you wish. If you never commit anything directly to release, there will never be a merge conflict.
(original branch) --o--o--o--o--o--o--o (default)
\ \ \
r ... ... --r--------r (release)
Result: You can update to any revision of release and expect a functioning state. You can run hg log -r release and you will only see the chosen checkpoints. You can examine the full log to see how everything happened. Drawbacks: Because the release branch depends on default, you can't push it to another repo without bringing default with it. Also hg glog -r release will look weird because of the repeated merges.
Solution 3: Use named branches as above, but use the rebase extension instead of merging. It has an option to copy, rather than move outright, the rebased changesets; and it has an option --collapse that will convert a set of revisions into a single one. Whenever you have a set of revisions r1:tip you want to finalize, copy them from default to release as follows:
hg rebase --source r1 --dest release --keep --collapse
This pushes ONE revision at the head of release that is equivalent to the entire changeset from r1 to the head of default. The --keep option makes it a copy, not a destructive rewrite. The advantage is that the release branch looks just as you wanted: nice and clean, and you can push it without dragging the default branch with it. The disadvantage is that you cannot relate its stages to the revisions in default, so I'd recommend method 2 unless you really have to hide the intermediate revisions. (Also: it's not as easy to squash your history in multiple batches, since rebase will move/copy all descendants of the "source" revision.)
All of these require you to do some extra work. This is inevitable, since mercurial has no way of knowing which revsets you'd like to squash.

it should let you filter that history to avoid seeing the stuff you might be tempted to squash
Mercurial has the tools for this. If you just don't want see (in hg log, I suppose) - filter these changesets with revsets:
hg log -r "not desc('autosave')"
Or if you use TortoiseHg, just go View -> Filter Toolbar, and type in "not desc('autosave')" in the toolbar. Voila, your autosave entries are hidden from the main list.

If you actually do want to keep all the tiny changes from every Ctrl-S in the repo history and only have log show the subset of the important ones, you could always tag the "important" changesets and then alias log to log -r tagged(). Or you could use the same principle with some other revset descriptor, such as including the text 'autosave' in the auto-committed messages and using log -r keyword(autosave), which would show you all non-autosaved commits.
To accomplish your goal, at least as I'd approach it, I'd use the mq extension and auto-commit the patch queue repository on every save. Then when you've finished your "idiot fumblings" you can hg qfinish the patch as a single changeset that can be pushed. You should (as always!) keep the changes centered around a single concept or step (e.g. "fixing the save button"), but this will capture all the little steps it took to get you there.
You'd need to
hg qinit --mq once to initialze the patch queue repo (fyi: stored at \.hg\patches\)
hg qnew fixing-the-save-btn creates a patch
then every time you save in your IDE
hg qrefresh to update the patch
hg commit --mq to make the small changeset in the patch queue repo
and when you are done
hg qfinish fixing-the-save-btn converts the patch into a changeset to be pushed
This keeps your fumblings local to your repo complete with what was changed every time you saved, but only pushes a changeset when it is complete. You could also qpop or qpush to change which item you were working on.
If you were to try the squash method, you'd lose the fumbling history when you squashed the changesets down. Either that or you'd be stuck trying to migrate work to/from the 'real' repository, which, I can tell you from experience, you don't want to do. :)

I would suggest you to use branches. When you start a new feature, you create a new branch. You can commit as many and often as you like within that branch. When you are done, you merge the feature branch into your trunk. In this way, you basically separate the history into two categories: one in fine-grain (history in feature branches), and the other in coarse-grain (history in the trunk). You can easily look at either one of them using the command: hg log --branch <branch-name>.

Mercurial clone cleanup to match upstream

I have a hg clone of a repository into which I have done numerous changes locally over a few months and pushed them to my clone at google code. Unfortunately as a noob I committed a whole bunch of changes on the default branch.
Now I would like to make sure my current default is EXACTLY as upstream and then I can do proper branching off default and only working on the branches..
However how do I do that cleanup though?
For reference my clone is http://code.google.com/r/mosabua-roboguice/source/browse
PS: I got my self into the same problem with git and got that cleaned up: Cleanup git master branch and move some commit to new branch?

First, there's nothing wrong with committing on the default branch. You generally don't want to create a separate named branch for every task in Mercurial, because named branches are forever. You might want to look at the bookmark feature for something closer to git branches ("hg help bookmarks"). So if the only thing wrong with your existing changesets is that they are on the default branch, then there really is nothing wrong with them. Don't worry about it.
However, if you really want to start afresh, the obvious, straightforward thing to do is reclone from upstream. You can keep your messy changesets by moving the existing repo and recloning. Then transplant the changesets from the old repo into the new one on a branch of your choosing.
If you don't want to spend the time/bandwidth for a new clone, you can use the (advanced, dangerous, not for beginners) strip command. First, you have to enable the mq extension (google it or see the manual -- I'm deliberately not explaining it here because it's dangerous). Then run
hg strip 'outgoing("http://upstream/path/to/repo")'
Note that I'm using the revsets feature added in Mercurial 1.7 here. If you're using an older version, there's no easy way to do this.

The best way to do this is with two clones. When working with a remote repo I don't control I always keep a local clone called 'virgin' to which I make no changes. For example:
hg clone -U https://code.google.com/r/mosabua-roboguice-clean/ mosabua-roboguice-clean-virgin
hg clone mosabua-roboguice-clean-virgin mosabua-roboguice-clean-working
Note that because Mercurial uses hard links for local clones and because that first clone was a clone with -U (no working directory (bare repo in git terms)) this takes up no additional disk space.
Work all you want in robo-guice working and pull in robo-guice virgin to see what's going on upstream, and pull again in roboguice-working to get upstream changes.
You can do something like this after the fact by creating a new clone of the remote repo and if diskspace is precious use the relink extension to associate them.

Preface - all history changes have sense only for non-published repos. You'll have to push to GoogleCode's repo from scratch after editing local history (delete repo on GC, create empty, push) - otherwise you'll gust get one more HEAD in default branch
Manfred
Easy (but not short) way - default only+MQ
as Greg mentioned, install MQ
move all your commits into MQ-patches on top of upstream code
leave your changes as pathes forever
check, edit if nesessary and re-integrate patches after each upstream pull (this way your own CG-repo without MQ-patches will become identical to upstream)
More complex - MQ in the middle + separate branches
above
above
create named branch, switch to it
"Finish" patches
Pull upstream, merge with your branch changes (from defaut to yourbranch)
Commit your changes only into yourbranch
Rebasing
Enable rebase extension
Create named branch (with changeset in it? TBT)
Rebase your changesets to the new ancestor, test results
See 5-6 from "More complex" chapter

Perhaps you could try the Convert extension. It can bring a repository in a better shape, while preserving history. Of course, after the modifications have been done, you will have to delete the old repo and upload the converted one.

Mercurial push problem

I've just got a problem with hg push command. What I did - Firstly I created 2 branches hot-fix-1 and hot-fix-2 made some changes in each branche, merged it back to default and closed those branches with the command:
hg commit --close-branch
If I start hg branches I have the following output:
default 29:e62a2c57b17c
hg branches -c gives me:
default 29:e62a2c57b17c
hot-fix-2 27:42f7bf715392 (closed)
hot-fix-1 26:dd98f50934b0 (closed)
Thus hot-fix-* branches seems to be closed. However if I try to push the changes I have the next error message:
pushing to /Users/user1/projects/mercurial/mytag
searching for changes
abort: push creates new remote branches: hot-fix-1, hot-fix-2!
(use 'hg push --new-branch' to create new remote branches)
and it does not matter which command I use hg push -b . or hg push -b default
So the question is how I can push those changes to repository without creating new branches.
P.S I used to work with git and was hoping that similar branching model can be used in Mercurial. Thanks

First, as many others have pointed out, using a named branch for short lived work is not a recommended practice. Named branches are predominantly for long lived features, or for release management.
Given that you are in this situation, there are a few options available. All of them involve modifying history (as you're obviously trying to change something you've done).
One is to just push the branches as is, learn from the experience, and move on. If the rest of the team is fine with this, then it's a case of adding --new-branch to your push command.
If the rest of the team, or you, really want the history to be clean, then you'll need to dig deeper.
If you aren't pushing, then definitely make a clone of your current repo. This way you have a copy of the original work to fall back on.
I see 2 main approaches here. Strip off the merges and rebase your branches onto default. This will get rid of the named branches or graft/transplant your changes. Both will be the same end result, but the implementation is slightly different.
If you merely want to use graft, that is now a built-in function starting with HG 2.0. It replaces the transplant plugin, and is much nicer to work with as it uses your usual merge tool if there are conflicts.
To use it, update to the default branch. Then, use the command:
hg graft -D "2085::2093 and not 2091"
the string after -D is an hg revision selection query. In your case, you'd likely only need '{start}::{end}' where start is the changeset at the start of the branch, and end is the end changeset of the branch (ignoring the merge).
If you did several merges, you'd have to pick and choose the changesets more precisely.
The other option is to strip the final merges, and use the rebase command that is part of the mq plugin.
You'll have to strip your merge changesets to get rid of them, and then update to the tip of the branch you want to keep. Select the start of the first named branch, and do a rebase. This will change the parentage of the branch (if you're familiar with Git, then this is very much like it's rebase).
Then repeat for the second branch. You should now have one long branch with the name default.

Just do the:
hg push --new-branch
It will send over those branches, but they'll be closed on the receiving end too, so no one should be bothered.
See my comment on the question for why Named Branches are best saved for long-lived entities like 'stable' and anonymous branches, bookmarks, or clones are more suitable for short lived things like hot-fixes and new features.

Your hot-fix changes were made on branches. Regardless of whether the branch is active or closed, it does exist.
To push the changes to the server (without rewriting history), you must use the --new-branch option (e.g. hg push --new-branch`).
Since you merged the branches into default, there will still only be one head (as you have already seen in your local repo).
If you really can't live with pushing the branches to the server, then you must rewrite your local history as suggested in Mikezx6r's answer.
In addition to the methods he mentioned, you can also import the changesets into a patch queue and apply them to the tip of your default.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008