Doing without partial commits the "Mercurial way" - mercurial

Subversion shop considering switching to Mercurial, trying to figure out in advance what all the complaints from developers are going to be. There's one fairly common use case here that I can't see how to handle.
I'm working on some largish feature, and I have a significant part of the code -- or possibly several significant parts of the code -- in pieces all over the garage floor, totally unsuitable for checkin, maybe not even compiling.
An urgent bugfix request comes in. The fix is nice and local and doesn't touch any of the code I've been working on.
I make the fix in my working copy.
Now what?
I've looked at "Mercurial cherry picking changes for commit" and "best practices in mercurial: branch vs. clone, and partial merges?" and all the suggestions seem to be extensions of varying complexity, from Record and Shelve to Queues.
The fact that there apparently isn't any core functionality for this makes me suspect that in some sense this working style is Doing It Wrong. What would a Mercurial-like solution to this use case look like?
Edited to add: git, by contrast, seems designed for this workflow: git add the bugfix files, don't git add anything else (or git reset HEAD anything you might have already added), git commit.

Here's how I would handle the case:
have a dev branch
have feature branches
have a personal branch
have a stable branch.
In your scenario, I would be committing frequently to my branch off the feature branch.
When the request came in, I would hg up -r XYZ where XYZ is the rev number that they are running, then branch a new feature branch off of that(or up branchname, whatever).
Perform work, then merge into the stable branch after the work is tested.
Switch back to my work and merge up from the top feature branch commit node, thus integrating the two streams of effort.

Lots of useful functionality for Mercurial is provided in the form of extensions -- don't be afraid to use them.
As for your question, record provides what you call partial commits (it allows you to select which hunks of changes you want to commit). On the other hand, shelve allows to temporarily make your working copy clean, while keeping the changes locally. Once you commit the bug fix, you can unshelve the changes and continue working.
The canonical way to go around this (i.e. using only core) would probably be to make a clone (note that local clones are cheap as hardlinks are created instead of copies).

You would clone the repository (i.e. create a bug-fix branch in SVN terms) and do the fix from there.
Alternatively if it really is a quick fix you can use the -I option on commit to explicitly check-in individual files.

Like any DVCS, branching is your friend. Branching a repository multiple ways is the bread and butter of these system. Here's a git model you might consider adopting that works quite well with Mercurial, also.

In addition to what Santa said about branching being your friend...
Small-granularity commits are your friend. Rather than making lots of code changes in a single commit, make each logically self-contained code change in its own commit. Then it will be a lot easier to cherry-pick changes to merge between branches.

Don't use Mercurial without using the Mq Extension (it comes pre-packaged in the default installation). In addition to solving your specific problem, it solves a lot of other general problems and really should be the default way that you work (especially if you're using an IDE that doesn't integrate directly with Hg, making switching branches on the fly a difficult way to work).

Related

Mercurial: devs work on separate folders, why do they have to merge all the time

I have four devs working in four separate source folders in a mercurial repo. Why do they have to merge all the time and pollute the repo with merge changesets? It annoys them and it annoys me.
Is there a better way to do this?
Assuming the changes really don't conflict, you can use the rebase extension in lieu of merging.
First, put this in your .hgrc file:
[extensions]
rebase =
Now, instead of merging, just do hg rebase. It will "detach" your local changesets and move them to be descendants of the public tip. You can also pass various arguments to modify what gets rebased.
Again, this is not a good idea if your developers are going to encounter physical merge conflicts, or logical conflicts (e.g. Alice changed a feature in file A at the same time as Bob altered related functionality in file B). In those cases, you should probably use a real merge in order to properly represent the relevant history. hg rebase can be easily aborted if physical conflicts are encountered, but it's a good idea to check for logical conflicts by hand, since the extension cannot detect those automatically.
Your development team are committing little and often; this is just what you want so you don't want to change that habit for the sake of a clean line of commits.
#Kevin has described using the rebase extension and I agree that can work fine. However, you'll also see all the work sequence of each developer squished together in a single line of commits. If you're working on a stable code base and just submitting quick single-commit fixes then that may be fine - if you have ongoing lines of development then you might not won't want to lose the continuity of a developer's commits.
Another option is to split your repository into smaller self-contained repositories.
If your developers are always working in 4 separate folders, perhaps the contents of these folders can be modularised and stored as separate Mercurial repositories. You could then have a separate master repository that brought all these smaller repositories together within the sub-repository framework.
Mercurial is distributed, it means that if you have a central repository, every developer also has a private repository on his/her workstation, and also a working copy of course.
So now let's suppose that they make a change and commit it, i.e., to their private repository. When they want to hg push two things can happen:
either they are the first one to push a new changeset on the central server, then no merge will be required, or
either somebody else, starting from the same version, has committed and pushed before them. We can see that there is a fork here: from the same starting point Mercurial has two different directions, thus a merge is required, even if there is no conflict, because we do not want four different divergent contexts on the central server (which by the way is possible with Mercurial, they are called heads and you can force the push without merge, but you still have the divergence, no magic, and this is probably not what you want because you want to be able to checkout the sum of all the contributions..).
Now how to avoid performing merges is quite simple: you need to tell your developers to integrate others changes before committing their own changes:
$ hg pull
$ hg update
$ hg commit -m"..."
$ hg push
When the commit is made against the latest central version, no merge should be required.
If they where working on the same code, after pull and update some running of tests would be required as well to ensure that what was working in isolation still works when other developers work have been integrated. Taking others contributions frequently and pushing our own changes also frequently is called continuous integration and ensures that integration issues are discovered quickly.
Hope it'll help.

Is better to branch or fork a Mercurial repository?

Assuming you "own" a Mercurial repository is it better to branch or fork the repository when embarking on experimental code?
In my situation I'm a lone developer and about to embark on some experimental code. I expect this experiment to take 4 to 6 hours. If I consider the experiment a failure then it highly unlikely I will ever want to refer to it again.
I commit often and push regularly. During the experiment I could only commit and then toss the local repo. However I push regularly mainly as means of backup. The repository is hosted on bitbucket.
In this situation am I better to branch or fork the repository?
Being a fan of named branches for all work (branch-per-task, etc) I would recommend ... do both.
Clone the original repo, and in the new repo create a new named branch for the experimental work.
If you decide to keep the experimental work, push it back to the original repo (perform merges as appropriate).
If you decide not to keep the experimental work, just throw the experimental repo away.
If I consider the experiment a failure
then it highly unlikely I will ever
want to refer to it again.
You answered your own question, fork it locally then pull your changes in the main repository if successful.
Though I don't think you would generate too much content in 4 to 6 hours that would make a useless branch bothersome...
I'd say it doesn't really matter. In both cases it depends on the time of deletion of your experiment whether you can later refer to it or not.
When you do experimental branches you can later get rid of them by using the strip command, which is part of the mq extensions. Or you could use convert from a hg to a hg repository in order to remove dead branches. You can even simply clone and omit branches.
When you fork for experimental branches you can obvioulsy merge them later if you want to refer to it.
Therefore it all boils down to the question: How long do you want to keep experimental branches and where do you want to store them?
Consider a branch to be documentation. You and others can refer to the ideas, and changes later. Foresee the situation where you say "Oh I tried that once in a branch... but deleted it." The only cost is a little disk space.

How to back up local Mercurial repositories and use rebase?

My company is moving from Subversion to Mercurial. One of the reasons is with Hg we hope to be able to work more independently.
We are looking forward to using rebasing as our primary way to update from the main repository, at least in the beginning, to keep the history in one line, making the transition from Subversion easier.
Right now, if we need to work independently, we have two options: create a branch in Subversion, and commit there (aka merge hell), or not commit at all. With Mercurial we hope to be able to keep committing locally, and rebase every so often, thus gaining independence while staying free of the administration costs of bcreating named branches.
This all sounds cool until backing up comes into the picture. With Subversion it was obvious that if someone didn't commit, their work ccould be lost. But not committing became inconvenient quickly (no history, no log messages etc.), so people would commit time and again nevertheless.
With Mercurial it is going to be possible to go on committing and rebasing without pushing for extended periods of time, putting much more work at risk. So a question comes up: how to back up the stuff on developers' machines?
One solution would be to use some external backup software, but that doesn't sound like a very good idea.
We could also push to the main repo all the time (maybe even automatically?), but this would make it impossible to use rebasing, and would result in a lot of dangling heads in main.
We could push to a backup repo, and try to have only one head in the main repo. This sound a lot complicated.
Are there any other ways to do this? I'd like to find a solution that would let our developers use most of their Subversion knowledge in the beginning.
Just to throw this out there: I think you're making a mistake. A linear history isn't a big deal and pull/merge is the much more normal mercurial workflow. Embrace a non-linear history and leave rebase for special occasions.
You say "With Mercurial we hope to be able to keep committing locally, and rebase every so often, thus gaining independence while staying free of the administration costs of bcreating named branches.", but in mercurial one uses unnamed branches for this, so there's no administration costs.
See http://stevelosh.com/blog/2009/08/a-guide-to-branching-in-mercurial/#branching-anonymously for an explanation of how unnamed branches created automatically will give you exactly what you want with zero hassle.
I know it sounds like too-good-to-be-true snake oil, but your folks can just hg pull and hg merge and hg push when they're through and without anyone having to think about branch names, or who has what clone, or anything, you'll have a coordinated center with disconnected work.
You could also give each developer a repo:
https://example.com/repos/awesome-product
https://example.com/repos/awesome-product-wolever
https://example.com/repos/awesome-product-pintér
Where they can push to all they want.
It would mean a lot of dangling heads, but that's probably OK because nobody would look at them until it comes time to restore… Then you'd simply ask for only the ancestors of the current tip:
hg pull https://example.com/repos/awesome-product-wolever -r tip
If I were using this scheme, I'd set:
[paths]
backup = https://example.com/repos/awesome-product-wolever
in .hgrc/hgrc.
Alternatively, you could give each dev one overall backup repo by setting this in their global ~/.hgrc:
[paths]
backup = https://example.com/repos/wolever-backup
I think you're doing it the wrong way.
What you describe looks more like you're trying to put in place a tool to ensure/enforce a process, rather than to support the process.
Working in a decentralized way is quite a big shift compare to the centralized approach, and by trying to put the kind constraint you describe will more than likely cause more hurt than good for the experience perceived by your developers.
If you're scared that they won't push their changes on a regular basis to a main repository, then it would be good to ask yourself (and to ask them too) "Are we ready to change our process?"
Many people still claims that the DVCS will bring chaos to their software factory. It's true that the DVCS'es don't enforce a linear (single-point-of-failure) way to work, but what bring chaos is not the tool, it is the team spirit you built in your company.
Now, if you HAVE TO migrate to Mercurial (because it's already sold to the management level, or whatever), but feels more comfortable with the SVN approach, with a little "plus", try using HgSubversion first. Migration will still be easy later.

How to manage concurrent development with mercurial?

This is a best practice question, and I expect the answer to be "it depends". I just hope to learn more real world scenarios and workflows.
First of all, I'm talking about different changes for the same project, so no subrepo please.
Let's say you have your code base in an hg repository. You start to work on a complicated new feature A, then a complicated bug B is reported by your trusted tester (you have testers, right?).
It's trivial if (the fix for) B depends on A. You simlply ci A then ci B.
My question is what to do when they are independent (or at least it seems now).
I can think of the following ways:
Use a separate clone for B.
Use anonymous or named branches, or bookmarks, in the same repository.
Use MQ (with B patch on top of A).
Use branched MQ (I'll explain later).
Use multiple MQ (since 1.6)
1 and 2 are covered by an excellent blog by #Steve Losh linked from a slightly related question.
The one huge advantage of 1 over the other choices is that it doesn't require any rebuild when you switch from working on one thing to the other, because the files are physically separated and independent. So it's really the only choice if, for example, A and/or B touches a header file that defines a tri-state boolean and is included by thousands of C files (don't tell me you haven't seen such a legacy code base).
3 is probably the easiest (in terms of setup and overhead), and you can flip the order of A and B if B is a small and/or urgent fix. However it can get tricky if A and B touches the same file(s). It's easy to fix patch hunks that failed to apply if A and B changes are orthogonal within the same file(s), but conceptually it's still a bit risky.
4 can make you dizzy but it's the most powerful and flexible and scalable way. I default hg qinit with -c since I want to mark work-in-progress patches and push/pull them, but it does take a conceptual leap to realize that you can branch in MQ repo too. Here are the steps (mq = hg --mq):
hg qnew bugA; make changes for A; hg qref
mq branch branchA; hg qci
hg qpop; mq up -rtip^
hg qnew bugB; make changes for B; hg qref
mq branch branchB; hg qci
To work on A again: hg qpop; mq up branchA; hg qpush
It seems crazy to take so many steps, and whenever you need to switch work you must hg qci; hg qpop; mq up <branch>; hg qpush. But consider this: you have several named release branches in the same repository, and you need to work on several projects and bug fixes at the same time for all of them (you'd better get guaranteed bonus for this kind of work). You'd get lost very soon with the other approaches.
Now my fellow hg lovers, are there other/better alternatives?
(UPDATE) qqueue almost makes #4 obsolete. See Steve Losh's elegant description here.
I would always use named branches, because that lets Mercurial do its job: to keep your project history, and to remember why you made which changes in what order to your source code. Whether to have one clone or two sitting on your disk is generally an easy one, given my working style, at least:
Does your project lack a build process, so that you can test and run things right from the source code? Then I will be tempted to have just one clone, and hg up back and forth when I need to work on another branch.
But if you have a buildout, virtualenv, or other structure that gets built, and that might diverge between the two branches, then doing an hg up then waiting for the build process to re-run can be a big pain, especially if things like setting up a sample database are involved. In that case I would definitely use two clones, one sitting at the tip of trunk, and one sitting at the tip of the emergency feature branch.
It seems like there's no more or better choices than the ones I listed in the question. So here they are again.
Use one clone per project.
Pros: total separation, thus no rebuild when switching projects.
Cons: toolchain needs to switch between two clones.
Use anonymous or named branches, or bookmarks, in the same repository.
Pros: standard hg (or any DVCS) practice; clean and clear.
Cons: must commit before switching and rebuild after.
Use MQ with one patch (or multiple consecutive patches) per project.
Pros: simple and easy.
Cons: must qrefresh before switching and rebuild after; tricky and risky if projects are not orthogonal.
Use one MQ branch (or qqueue in 1.6+) per project.
Pros: ultra flexible and scalable (for the number of concurrent projects)
Cons: must qrefresh and qcommit before switching and rebuild after; feels complicated.
Like always, there's no silver bullet, so pick and choose the one right for the job.
(UPDATE) For anyone who's in love with MQ, using MQ on top of regular branches (#2 + #3) is probably the most common and preferable practice.
If you have two concurrent projects with baseline on two branches (for example next release and current release), it's trivial to hop between them like this:
hg qnew; {coding}; hg qrefresh; {repeat}
hg qfinish -a
hg update -r <branch/bookmark/rev>
hg qimport -r <rev>; {repeat}
For the last step, qimport should add a -a option to import a line of changesets at once. I hope Meister Geisler notices this :)
So the question is, at the point when you are told to stop working on feature A, and begin independent feature B, what alternative options are there, for: How to manage concurrent development with mercurial?
Let's look at the problem with concurrency removed, the same way you write threaded code- define a simple work flow for solving any problem given to you, and apply it to each problem. Mercurial will join the work, once it's done. So, programmer A will work on feature A. Programmer B will work on feature B. Both just happen to be you. (If only we had multi-core brains:)
I would always use named branches, because that lets Mercurial do its job: to keep your project history, and to remember why you made which changes in what order to your source code.
I agree with Brandon's sentiment, but I wonder if he overlooked that feature A has not been tested? In the worst case, the code compiles and passes unit tests, but some methods implement the previous requirements, and some methods implement the new ones. A diff against the previous check-in is the tool I would use to help me get back on track with feature A.
Is your code for feature A at a point when you would normally check it in? Switching from feature A to working on feature B is not a reason to commit code to the head or to a branch. Only check in code that compiles and passes your tests. My reason is, if programmer C needs to begin feature C, a fresh checkout of this branch is no longer the best place to start. Keeping your branch heads healthy, means you can respond quickly, with more reliable bug fixes.
The goal is to have your (tested and verified) code running, so you want all your code to end up merged into the head (of your development and legacy branches). My point seems to be, I've seen branching used inefficiently: code becomes stale and then not used, the merge becomes harder than the original problem.
Only your option 1 makes sense to me. In general:
You should think your code works, before someone else sees it.
Favor the head over a branch.
Branch and check-in if someone else is picking up the problem.
Branch if your automated system or testers need your code only.
Branch if you are part of a team, working on a problem. Consider it the head, see 1-4.
With the exception of config files, the build processes should be a checkout and a single build command. It should not be any more difficult to switch between clones, than for a new programmer to join the project. (I'll admit my project needs some work here.)

Mercurial: Concrete examples of issues using hg pull --rebase

I'm struggling to find the mercurial workflow that fits the way that we work.
I'm currently favouring a clone per feature but that is quite a change in mindset moving from Subversion. We'll also have issues with the current expense we have in setting up environments.
Using hg pull --rebase seems to give us more of a Subversion-like workflow but from reading around I'm wary of using it.
I think I understand the concepts and I can see that rewriting the history is not ideal but I can't seem to come up with any scenarios which I personally would consider unacceptable.
I'd like to know what are the 'worst' scenarios that hg pull --rebase could create either theoretical or from experience. I'd like concrete examples rather than views on whether you 'should' rewrite history. Not that I'm against people having opinions, just that there already seem to be a lot of them expressed on the internet without many examples to back them up ;)
The first thing new Mercurial converts need to learn is to get comfortable committing incomplete code. Subversion taught us that you shouldn't commit broken code. Now it's time to unlearn that habit. Committing frequently gives you a lot more flexibility in your workflow.
The main problem I see with hg pull --rebase is the ability to break a merge without any way to undo. The DVCS model is based on the idea of tracking history explicitly, and rebasing subverts that idea by saying that all of my changes came after all of your changes, even though we were really working on them at the same time. And because I don't know what your changes are (because I was basing my code off of earlier changesets) it's harder for me to know that my code, on top of yours, won't break something. You also lose the branching capabilities by rebasing, which is really the whole idea behind DVCSs.
Our workflow (which we've built an entire Mercurial hosting system around) is based on keeping multiple clones, or branch repositories, as we call them. Each dev or small team has their own branch repository, which is just a clone of the "central" repository. All of my new features and large bug fixes go into my personal branch repo. I can get that code peer reviewed, and once it's deemed ready, I can merge it into the central repo.
This gives me a few nice benefits. First, I won't be breaking the build, as all of my changes are in their own repo until they're "ready". Second, I can make another branch repo if I need to do a separate feature, or if I have something longer-running, like for the next major version. And third, I can easily get a change into the central repo if there's a bug that needs to be fixed quickly.
That said, there are a couple different ways you can use this workflow. The most simple, and the one I started with, is just keeping separate clones. So I'll have website-central, website-tghw, etc. It works well, especially since you can push and pull between them locally. More recently, I've started keeping multiple heads in the same repo, using the remotebranches extension to help manage them and hg nudge to keep from pushing everything at once.
Of course, some people don't like this workflow as much, usually because their Mercurial server makes it hard to make server-side clones. In that case, you can also look at using named branches to help keep your features straight. Unfortunately, they're not quite as flexible as Git branches (which is why we prefer branch repos) but they work well once you understand how to close branches, and why you can't really get rid of them once you start one.
This is getting a bit long, so I'll wrap it up by encouraging you to embrace the superior branching and merging that Mercurial provides (over SVN). There is definitely a learning curve, but once you get the hang of it, it really does make things easier.
From the question comments, your root issue is that you have developers working on several features/bug fixes/issues at one time and having uncommitted work in their working directory along with some completed work that is ready to be pushed back to the central repository.
There's a really nice exchange that covers the issue well and leads on to a number of ways forward.
http://thread.gmane.org/gmane.comp.version-control.mercurial.general/19704
There are ways you can get around keeping your uncommitted changes, e.g. by having a separate clone to handle merges, but my advice would be to embrace the distributed way of working and commit as often as you like - if you really feel the need you can combine the last few local commits into a single changeset (using MQ, for example) before pushing.