Branching with Mercurial SCM

So right now I'm learning Ruby on Rails, and I'm working through the book "Agile Web Development with Rails". I've also decided that I want to give Mercurial a go, because I've read up on distributed SCMs, and it seems like an ideal situation. I still, however, prefer to push my code remotely to my Linux VPS just in case my hard drive decides to take a dive.
So, my question is specific to branching in Mercurial. Right now I've got a remote repository set up and I can push changes over SSH easily (hell I even set up an Nginx FastCGI site that lets me push, too). What I'd like to do, however, is create branches for each chapter as I work on them, so I can keep a nice organized history of my progress through the book. So this is what I'm doing:
$ hg branch chapter-10
(do chapter 10 stuff)
$ hg commit -m "Chapter 10 complete"
$ hg update default
$ hg merge chapter-10
$ hg commit -m "Merging chapter 10 into default"
$ hg push
Once I execute the push statement, I get this message from Mercurial:
pushing to ssh://myserver/hg/depot
searching for changes
abort: push creates new remote branch 'chapter-10'!
(did you forget to merge? use push -f to force)
So at this point I try to do an hg merge again, and it tells me there's nothing to merge, which is obviously true because I just merged it. When I force the push with -f, everything seems fine, and even the web interface shows the appropriate branches.
To sum up, my question is simple: Am I doing this the right way? Is there a more appropriate way to do this with Mercurial (i.e. the "Mercurial way")? Honestly I just want the repository to serve as a backup. I'm a fan of the distributed SCM model, but to me it feels sorta "dirty" to force pushes. Any insight is greatly appreciated! Thanks in advance.

push -f is the right option in your case. There was a discussion last month about mentioning it in the message shown when this "push creates new remote branch" warning pops up: see issue 1513.
However, issue 1974 (this month) mentions some undesirable effects (not in your case though).
See this translated article to know more about creating a second head on a remote repo.
On the more general point, you can use branches if you are writing your chapters in parallel and want to merge them only at certain (stable) points in time.
But if your writing process is more linear, you could use only one branch, and put some tags along the way.
However, should you go back to chapter 10 and add some lines, even though you already put tags 11 and 12, that would make the history harder to read. So branches are still a good idea in this case.

I don't know about your specific problem, but from your comments it seems that you use branches where you probably wanted to use tags.
Branches are generally used when multiple people cooperate on the same project and you want to create a work separation so one person can work on a stable piece of code, while the other does something experimental that temporarily breaks functionality. Alternatively branches are used to stabilize for release, while development is going on in trunk.
Tags (or labels) are primarily used to create a marker signifying that a version of the code has some importance. For example, if you want to mark the completion of chapter 10, you just tag the current version with a 'chapter-10' tag. There is no need to branch. You can branch from a tagged version at any point in the future if that becomes necessary for some reason.
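As a rough sketch of the tag-based approach described above, assuming a linear history on default (the function name and commit message wording are made up for illustration):

```shell
# Hypothetical helper: record the end of a chapter with a tag instead of a branch.
# Usage: finish_chapter 10
finish_chapter() {
    chapter="$1"
    # Commit the chapter's work on the current branch.
    hg commit -m "Chapter $chapter complete" &&
    # hg tag records the name in .hgtags and commits that change itself.
    hg tag "chapter-$chapter" &&
    # No new named branch is involved, so a plain push is enough.
    hg push
}
```

Later on, hg update chapter-10 should bring the working copy back to the state it had when that chapter was finished.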

In this case I feel that it's totally ok to use -f for the push. It just creates new branches, not heads. Creating remote heads is another matter entirely.
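For readers on Mercurial 1.6 or later, push also has a --new-branch option that covers this case without a full -f. A minimal sketch of the workflow from the question using it (the function name is hypothetical):

```shell
# Hypothetical helper: merge a finished chapter branch and publish it.
# Usage: publish_chapter chapter-10
publish_chapter() {
    branch="$1"
    hg update default &&
    hg merge "$branch" &&
    hg commit -m "Merging $branch into default" &&
    # --new-branch (Mercurial 1.6+) allows new named branches only;
    # unlike -f it still refuses to create extra heads on existing branches.
    hg push --new-branch
}
```

That keeps the safety net for accidental new heads while accepting the new chapter branch.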

Related

Mercurial: devs work on separate folders, why do they have to merge all the time

I have four devs working in four separate source folders in a mercurial repo. Why do they have to merge all the time and pollute the repo with merge changesets? It annoys them and it annoys me.
Is there a better way to do this?

Assuming the changes really don't conflict, you can use the rebase extension in lieu of merging.
First, put this in your .hgrc file:
[extensions]
rebase =
Now, instead of merging, just do hg rebase. It will "detach" your local changesets and move them to be descendants of the public tip. You can also pass various arguments to modify what gets rebased.
Again, this is not a good idea if your developers are going to encounter physical merge conflicts, or logical conflicts (e.g. Alice changed a feature in file A at the same time as Bob altered related functionality in file B). In those cases, you should probably use a real merge in order to properly represent the relevant history. hg rebase can be easily aborted if physical conflicts are encountered, but it's a good idea to check for logical conflicts by hand, since the extension cannot detect those automatically.
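A small sketch of what the rebase-based routine could look like day to day, assuming the changes really are independent (the function name is made up; the extension is switched on per command here rather than in .hgrc):

```shell
# Hypothetical helper: put local commits on top of the latest pulled work, then push.
# Usage: sync_by_rebase
sync_by_rebase() {
    # The extension is enabled for this command only via --config;
    # with rebase loaded, "pull --rebase" pulls and then rebases local changesets.
    hg --config extensions.rebase= pull --rebase &&
    hg push
}
```

If the rebase stops on a conflict, hg rebase --abort returns the repository to its previous state, and a regular merge can be done instead.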

Your development team are committing little and often; this is just what you want so you don't want to change that habit for the sake of a clean line of commits.
@Kevin has described using the rebase extension and I agree that can work fine. However, you'll also see all the work sequence of each developer squished together in a single line of commits. If you're working on a stable code base and just submitting quick single-commit fixes then that may be fine - if you have ongoing lines of development then you might not want to lose the continuity of a developer's commits.
Another option is to split your repository into smaller self-contained repositories.
If your developers are always working in 4 separate folders, perhaps the contents of these folders can be modularised and stored as separate Mercurial repositories. You could then have a separate master repository that brought all these smaller repositories together within the sub-repository framework.
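To give an idea of the sub-repository setup, here is a sketch that registers one module in a master repository. The directory and URL passed in are placeholders, and the function name is hypothetical:

```shell
# Hypothetical helper: register an existing repository as a subrepository.
# Usage (inside the master repository): add_subrepo <dir> <source-url>
add_subrepo() {
    dir="$1"
    source_url="$2"
    # Clone the module into the master repository's working directory.
    hg clone "$source_url" "$dir" &&
    # .hgsub maps each subrepository path to the location it is pulled from.
    printf '%s = %s\n' "$dir" "$source_url" >> .hgsub &&
    hg add .hgsub &&
    hg commit -m "Add subrepository $dir"
}
```

Each developer can then commit and push inside their own module, while the master repository records which revision of each module belongs together.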

Mercurial is distributed, which means that even if you have a central repository, every developer also has a private repository on his/her workstation, and a working copy as well, of course.
So now let's suppose that they make a change and commit it, i.e., to their private repository. When they want to hg push two things can happen:
either they are the first one to push a new changeset to the central server, in which case no merge is required, or
somebody else, starting from the same version, has committed and pushed before them. There is a fork here: from the same starting point the history now goes in two different directions, so a merge is required even if there is no conflict, because we do not want four divergent contexts on the central server. (That is possible with Mercurial, by the way: they are called heads, and you can force the push without merging. But the divergence remains, no magic, and this is probably not what you want, because you want to be able to check out the sum of all the contributions.)
Avoiding merges is quite simple: tell your developers to integrate others' changes before committing their own:
$ hg pull
$ hg update
$ hg commit -m"..."
$ hg push
When the commit is made against the latest central version, no merge should be required.
If they were working on the same code, some tests should be run after pull and update as well, to ensure that what was working in isolation still works once the other developers' work has been integrated. Taking others' contributions frequently and pushing our own changes just as frequently is called continuous integration, and it ensures that integration issues are discovered quickly.
Hope it'll help.

Can't push new heads - fail to see how they would be created

I'm fairly new to version control in teams. So far I've mostly used it solo.
I've read that the following workflow is recommended:
Commit locally, pull master, merge master into my branch, merge my branch into master, push. Several times a week or even day
So that's what I tried to do. However, when I was done with my feature and tried to push, TortoiseHg told me that this would create new remote heads.
hg help push tells me about two options:
Merge first: Did that
Use -f: I know enough not to do that.
I think I understand the concept of rebasing - which I don't think applies here, since I'm the only one who did anything in this commit tree. Of course I've pulled.
So my question is: How can I resolve this specific situation?
Also, recommendations for where to learn proper version control workflow would be nice. Everything I find tells me what the commands are, but I've failed to find clear instructions on when to use them.
I've added a picture of the project. Commit 147 was mine, and I could push it just fine. All other commits are also made by me.

hg reports a "head" for every named branch. In your screenshot, you need to push rev 154, which is the head of your kjeld branch. It is an outgoing changeset because you are pushing rev 155 and must therefore push 155's entire history as well. Others will get that branch when they pull your changes and will have a head on their version of kjeld (note that it will most likely not be numbered 154, since those numbers are repo-specific). You will be fine, though: that head is a close-branch changeset, so it will not appear in their default list for hg heads and hg branches.
One way to avoid your current issue is to use bookmarks to temporarily note what that head represents e.g. issue-45, big-feature-2, etc. and only push when merged into mainline development.
For us, we set up a "private" repo for each dev on the server where they store/backup work in progress. It is expected that there are multiple heads, dead branches, and other gunk in these "private" repos. The dev repo, however, only ever has a single head and must pass the build and build tests.
In response to your comment about your "private" branch: When you push your tip you will also push your branch named kjeld. Others who want to work on that code must pull it to get the tip of your development. It will not be a "private" branch.
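To make the bookmark suggestion above a bit more concrete, here is a minimal sketch. The function names are hypothetical, and the bookmark name is just the example used in the answer:

```shell
# Hypothetical helpers: label short-lived work with a bookmark
# instead of opening a named branch.
# Usage: start_task issue-45   ...commit as usual...   resume_task issue-45
start_task() {
    # Mark the current revision; the active bookmark moves along with new commits.
    hg bookmark "$1"
}

resume_task() {
    # Return to the bookmarked head after working on something else.
    hg update "$1"
}
```

Bookmark names stay local unless they are published explicitly (hg push -B NAME), so nothing permanent like a named branch ends up in the shared history.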

Close an unmerged wasteful branch in mercurial

I decide to start an experiment in a branch
[default] $ hg branch experiment
[experiment] $ [... some commits ...]
Aargh! does not work! I want to throw it away.
[experiment] $ hg commit -m "did not work; closing ..." --close-branch
[experiment] $ hg update default
To get the real tip back -
[default] $ [... some commits ...]
[default] $ hg push
Is this a correct workflow to destroy an experimental branch?

You've got two fine answers on how to undo your branch, but the bigger point is: don't use named branches for temporary concepts. Named branches are for long-lived entities like 'development' and 'stable'. For features, experiments, etc. you want either clones, bookmarks, or anonymous branches. All three are contrasted with named branches in this excellent article by Steve Losh:
http://stevelosh.com/blog/2009/08/a-guide-to-branching-in-mercurial/
You can see similar advice from the Mercurial project here:
https://www.mercurial-scm.org/wiki/StandardBranching

The Mercurial wiki covers all the options for Pruning Dead Branches. Briefly, these options include:
Closing the branch (as done in your original post)
Create a new clone that does not include the dead branch
Use a no-op merge
Use the strip command that is bundled with the mq extension

Closing a branch will leave it in the repository, and the closed branch will be pushed with other changesets next time you do a push.
If you don't want this to happen, and your branch is local, just strip it.
On the other hand, if you have already pushed the experimental branch, stripping it won't help, so you can either close it or do a dummy merge (or both).
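For the local-only case, stripping could look roughly like this. It is a sketch that assumes the branch was never pushed and that mainline is called default; the function name is made up:

```shell
# Hypothetical helper: remove a named branch that exists only in the local repository.
# Usage: drop_local_branch experiment
drop_local_branch() {
    branch="$1"
    # Leave the branch first so the working directory is not on a stripped changeset.
    hg update default &&
    # strip ships as its own extension in newer releases (older ones get it via mq);
    # the revset selects every changeset on the branch, and strip also removes
    # any descendants of those changesets.
    hg --config extensions.strip= strip -r "branch('$branch')"
}
```

By default strip keeps a backup bundle of what it removed under .hg/strip-backup, so the experiment can still be recovered if it turns out to be useful after all.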

In my opinion, you should just close the branch and forget about it.
In the long run, there's no harm in a "dead" branch being present in the repository. Any given branch is almost certainly tiny in comparison to the contents of your repository and any additional "noise" created by the additional changesets is going to fade into the past relatively quickly.
However, by not worrying about cleaning up the branch, you achieve two things:
You don't have to deal with any of the potential issues associated with altering history in a DVCS.
(More importantly) You have a permanent record of your attempt.
That second point is key -- you can actually make use of what you learned if the branch is still around: any fellow developers can learn from it; you can go back and try again if you learn something else; you can prevent trying the same thing again by seeing this branch in history.
A lot of developers have a hard time with keeping history that isn't "pristine" in their DVCS, especially when they recently came from a centralized VCS.* Over time, I've come to realize that there's nothing bad or wrong about that "other" history and in fact it can turn out to be remarkably useful if kept around.
*I'm not necessarily implying that you fall into either of these camps, just making an observation.

Is there any way of deleting a branch that has been pushed to a repo shared by all developers?

I have accidentally pushed a branch to a repo. Is there any way I could alter the repo (and remove the branch)? Closing it is not a solution.

You've got a couple of options, none of them easy, and none of them will leave you with a "phew, saved by the bell" feeling afterwards.
The only real way to fix this problem is to try to avoid it in the first place.
Having said that, let's explore the options here:
Eradicate the changesets
Introduce further changesets that "undo" the changes
The first option, to eradicate the changesets, is hard. Since you pushed the changesets to your central repository, you need direct access to the repositories on that server.
If this is a server where you don't have direct access to the repositories, only access through a web interface or through push/pull/clone, then your only hope is that the web interface has a way of eradicating those changesets; otherwise go to option 2.
In order to get rid of the changesets, you can either make a new clone of the repository with the changesets, and specify options that stop just shy of introducing the changesets you want to get rid of, or you can use the MQ extension and strip the offending changesets out.
Either is good, but personally I like the clone option.
However, this option hinges on the fact that any and all developers that are using the central repository either:
Have not already pulled the offending changesets from the central repository.
Or are prepared to get rid of said changesets locally as well.
For instance, you could instruct all your developers to kill their local clones, and reclone a fresh copy after you have stripped away the changesets in the central repository.
Here's the important part:
If you cannot get all developers to help with this, you should drop this line of thought and go to option 2 instead
Why? Because now you have two problems:
You need to introduce barriers that ensure no developers can push the same changesets onto the server again, after you got rid of them. Note that relying on the warning by the server to prevent new branches being pushed is perhaps not good enough, as developers might have branches of their own they want to push, and thus not notice that they'll be pushing yours as well.
Any work any developer has done based on any of the offending changesets must either be rebased to a new place, or eradicated as well.
In short, this will give you lots of extra work. I would not do this unless the offending changesets were super-critical to get rid of.
Option 2, on the other hand, comes with its own problems, but is a bit easier to carry out.
Basically you use the hg backout command to introduce a new changeset that reverses the modifications done by the offending changesets, and commit and push that.
The problem here is that if at some point you really want to introduce those changesets, you will have to fight a bit with Mercurial in order to get the merges right.
However, there will be no more work for your fellow developers. The next time they pull, they'll get your correction changeset as well.
Let me just restate this option in different words:
Instead of getting rid of the changesets, keep them, but introduce another changeset that reverses the changes.
Neither option is good, both will generate some extra work.
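Both options can be sketched in a couple of commands. The function names and commit message below are made up for illustration, and the revision arguments are whatever identifies your last good and offending changesets:

```shell
# Option 1 sketch: build a fresh clone that stops at the last good revision.
# Usage: clone_up_to <last-good-rev> <source-repo> <new-repo>
clone_up_to() {
    # -r limits the clone to the given revision and its ancestors.
    hg clone -r "$1" "$2" "$3"
}

# Option 2 sketch: add a changeset that reverses one offending changeset.
# Usage: undo_changeset <rev>
undo_changeset() {
    # In recent versions backout commits automatically only when <rev> is the
    # working directory's parent; otherwise the reversed changes are left in
    # the working directory for a manual commit.
    hg backout -m "Back out changeset $1" "$1"
}
```

If the part of the history you want to keep has several heads, clone accepts -r more than once. For a branch made of several changesets, backout would have to be repeated for each one, newest first.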

We ran into a similar problem once, when we had to remove a branch from the server repo from which all devs regularly pull. Backout wasn't an option because the problematic branch had already been pulled by everyone.
We stripped the branch in the server repo (hg strip from the MQ extension). From then on, any developer who still had the branch and tried to push got the message “push creates new remote branches”, even though they hadn't actually created any. We created a batch file with the strip command, distributed it among the devs, and explained that the “new remote branches” message is a signal to run the batch file.
This approach takes some time and effort before everybody gets rid of the branch, but it works.

If the 'backout' option described in Jason's comment above doesn't do it for you, you can remake the repo up until the point of your mistaken push using hg convert, which (despite its name) also works with hg.
eg hg convert -r before-mistaken-push /path/to/original /path/to/new
You might have to play with the usebranchnames and clonebranches settings.

How to manage concurrent development with mercurial?

This is a best practice question, and I expect the answer to be "it depends". I just hope to learn more real world scenarios and workflows.
First of all, I'm talking about different changes for the same project, so no subrepo please.
Let's say you have your code base in an hg repository. You start to work on a complicated new feature A, then a complicated bug B is reported by your trusted tester (you have testers, right?).
It's trivial if (the fix for) B depends on A. You simply ci A then ci B.
My question is what to do when they are independent (or at least it seems now).
I can think of the following ways:
Use a separate clone for B.
Use anonymous or named branches, or bookmarks, in the same repository.
Use MQ (with B patch on top of A).
Use branched MQ (I'll explain later).
Use multiple MQ (since 1.6)
1 and 2 are covered by an excellent blog post by @Steve Losh linked from a slightly related question.
The one huge advantage of 1 over the other choices is that it doesn't require any rebuild when you switch from working on one thing to the other, because the files are physically separated and independent. So it's really the only choice if, for example, A and/or B touches a header file that defines a tri-state boolean and is included by thousands of C files (don't tell me you haven't seen such a legacy code base).
3 is probably the easiest (in terms of setup and overhead), and you can flip the order of A and B if B is a small and/or urgent fix. However, it can get tricky if A and B touch the same file(s). It's easy to fix patch hunks that failed to apply if the A and B changes are orthogonal within the same file(s), but conceptually it's still a bit risky.
4 can make you dizzy but it's the most powerful and flexible and scalable way. I default hg qinit with -c since I want to mark work-in-progress patches and push/pull them, but it does take a conceptual leap to realize that you can branch in MQ repo too. Here are the steps (mq = hg --mq):
hg qnew bugA; make changes for A; hg qref
mq branch branchA; hg qci
hg qpop; mq up -rtip^
hg qnew bugB; make changes for B; hg qref
mq branch branchB; hg qci
To work on A again: hg qpop; mq up branchA; hg qpush
It seems crazy to take so many steps, and whenever you need to switch work you must hg qci; hg qpop; mq up <branch>; hg qpush. But consider this: you have several named release branches in the same repository, and you need to work on several projects and bug fixes at the same time for all of them (you'd better get guaranteed bonus for this kind of work). You'd get lost very soon with the other approaches.
Now my fellow hg lovers, are there other/better alternatives?
(UPDATE) qqueue almost makes #4 obsolete. See Steve Losh's elegant description here.
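A rough sketch of the qqueue variant, assuming the mq extension is enabled and Mercurial is 1.6 or later (the function and queue names are hypothetical):

```shell
# Hypothetical helpers: one patch queue per task.
# Usage: new_task_queue featureA   /   switch_task_queue bugB
new_task_queue() {
    # Unapply everything from the current queue, then create and activate a new one.
    hg qpop -a
    hg qqueue --create "$1"
}

switch_task_queue() {
    # Queues can only be switched while no patches are applied.
    hg qpop -a
    hg qqueue "$1" &&
    hg qpush -a
}
```

Run hg qrefresh before switching so the current patch holds the latest changes; qpop refuses to unapply a patch while there are unsaved local modifications.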

I would always use named branches, because that lets Mercurial do its job: to keep your project history, and to remember why you made which changes in what order to your source code. Whether to have one clone or two sitting on your disk is generally an easy one, given my working style, at least:
Does your project lack a build process, so that you can test and run things right from the source code? Then I will be tempted to have just one clone, and hg up back and forth when I need to work on another branch.
But if you have a buildout, virtualenv, or other structure that gets built, and that might diverge between the two branches, then doing an hg up then waiting for the build process to re-run can be a big pain, especially if things like setting up a sample database are involved. In that case I would definitely use two clones, one sitting at the tip of trunk, and one sitting at the tip of the emergency feature branch.
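Setting up the second clone is a one-liner. As a sketch (the function name and the paths in the usage line are placeholders):

```shell
# Hypothetical helper: give an urgent task its own clone and working directory.
# Usage: clone_for_task <source-repo> <new-dir> <branch-or-rev>
clone_for_task() {
    # -u checks out the given branch, tag, or revision in the new clone.
    hg clone -u "$3" "$1" "$2"
}
```

For example, clone_for_task ~/src/project ~/src/project-hotfix stable leaves the original clone and its build output untouched. Commits made in the new clone go back with a plain hg push, since its default path points at the source.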

It seems like there are no more or better choices than the ones I listed in the question. So here they are again.
Use one clone per project.
Pros: total separation, thus no rebuild when switching projects.
Cons: toolchain needs to switch between two clones.
Use anonymous or named branches, or bookmarks, in the same repository.
Pros: standard hg (or any DVCS) practice; clean and clear.
Cons: must commit before switching and rebuild after.
Use MQ with one patch (or multiple consecutive patches) per project.
Pros: simple and easy.
Cons: must qrefresh before switching and rebuild after; tricky and risky if projects are not orthogonal.
Use one MQ branch (or qqueue in 1.6+) per project.
Pros: ultra flexible and scalable (for the number of concurrent projects)
Cons: must qrefresh and qcommit before switching and rebuild after; feels complicated.
Like always, there's no silver bullet, so pick and choose the one right for the job.
(UPDATE) For anyone who's in love with MQ, using MQ on top of regular branches (#2 + #3) is probably the most common and preferable practice.
If you have two concurrent projects with baseline on two branches (for example next release and current release), it's trivial to hop between them like this:
hg qnew; {coding}; hg qrefresh; {repeat}
hg qfinish -a
hg update -r <branch/bookmark/rev>
hg qimport -r <rev>; {repeat}
For the last step, qimport should add a -a option to import a line of changesets at once. I hope Meister Geisler notices this :)

So the question is, at the point when you are told to stop working on feature A, and begin independent feature B, what alternative options are there, for: How to manage concurrent development with mercurial?
Let's look at the problem with concurrency removed, the same way you write threaded code: define a simple workflow for solving any problem given to you, and apply it to each problem. Mercurial will join the work once it's done. So, programmer A will work on feature A. Programmer B will work on feature B. Both just happen to be you. (If only we had multi-core brains :)
I would always use named branches, because that lets Mercurial do its job: to keep your project history, and to remember why you made which changes in what order to your source code.
I agree with Brandon's sentiment, but I wonder if he overlooked that feature A has not been tested? In the worst case, the code compiles and passes unit tests, but some methods implement the previous requirements, and some methods implement the new ones. A diff against the previous check-in is the tool I would use to help me get back on track with feature A.
Is your code for feature A at a point when you would normally check it in? Switching from feature A to working on feature B is not a reason to commit code to the head or to a branch. Only check in code that compiles and passes your tests. My reason is that if programmer C needs to begin feature C, a fresh checkout of this branch is no longer the best place to start. Keeping your branch heads healthy means you can respond quickly, with more reliable bug fixes.
The goal is to have your (tested and verified) code running, so you want all your code to end up merged into the head (of your development and legacy branches). My point seems to be, I've seen branching used inefficiently: code becomes stale and then not used, the merge becomes harder than the original problem.
Only your option 1 makes sense to me. In general:
You should think your code works, before someone else sees it.
Favor the head over a branch.
Branch and check-in if someone else is picking up the problem.
Branch if your automated system or testers need your code only.
Branch if you are part of a team, working on a problem. Consider it the head, see 1-4.
With the exception of config files, the build processes should be a checkout and a single build command. It should not be any more difficult to switch between clones, than for a new programmer to join the project. (I'll admit my project needs some work here.)
