hg: fix branching without merges

hg: fix branching without merges - mercurial

I have repository with wrong branching. Branches was used like tag for commit message, showing related part of project (eg data, search - like features). No merging was used. Each next commit, if it is about other feature than prev, just reopens branch with desired name.
It looks like
o changeset: 717
| branch: default
|
o changeset: 523
| branch: search
|
o changeset: 357
| branch: data
|
o changeset: 397
| branch: data
|
o changeset: 789
| branch: default
Whats the right way to stop that ugliness?
Update to each branch-head and merge with last commit consequentially? But there nothing to merge actually.
Or update to each head, commit with "close branch", update to next head ... and at last update to default?

Firstly, as Aaron pointed out, and assuming these are misunderstandings by the commiters, you will need to teach the others working on that repo when its appropriate ( by your definition of appropriate ) to create named branches, and when it isn't. I imagine that would solve most of your problems.
If you want to bring everything back in line, and the branches are currently open, you can close each branch and then merge them into default. If you don't merge after closing, you will have a 'dangling' commit in your history graph, but that may or may not matter to you. Also, if you only have 1 'active' branch, which you can check with
hg branches
then all the commits are currently merged into that active branch, and you may only need to merge that branch into default to bring things in line. However you may still want to close the inactive branches, which will leave you with dangling commits if you don't merge them after closing.
If you want to avoid having commits with branchnames, you could look at using bookmarks, which are sort of an amalgamation of branching and tagging.
If you want to minimise branching ( even anonymous branching ) on your repo, you could use rebase, which is a built in extension that you have to enable. You can use the rebase switch when pulling someone else's changes, to automatically move your commits after the commits you are pulling in and make the history linear. You cannot do this once your commits have already been pulled or pushed elsewhere however ( well, you could, but it would make things very complicated depending on what happens with your pulled commits ). However using rebase would depend on how you feel about keeping accurate history.
To clean up the existing repo could be a bit of a mission. You could use rebase to move all the commits to a common branch, such as default, however since it changes your history it will no longer be 'compatible' with the existing clones of that repo. So you would have to use rebase to alter your history, and then have everyone ( including servers ) reclone the altered repo and trash the old one ( they cannot even pull any outstand commits from the old one to the altered repo ).

From the look of the history (just one line), people really just gave each commit a different branch name without actually ever doing any work on two different branches at the same time.
While I haven't seen this before, it's a interesting and clean way to sort changesets into features without creating an ugly history.
As for "stopping" this, just stop reopen branches. There is nothing that you need to do to "clean" the history since there aren't actually any heads which need merging. Therefore, Mercurial will tell you "nothing to merge" if you check out default and say hg merge search.

Related

Splitting out multiple intermediate changes in Mercurial

We have a very unfortunate situation where a new feature branch was made from a second, unrelated feature branch that had been put on hold rather than from default. There's multiple changesets within both feature branches and while most changed files are unrelated several high level project files that have edits on both branches. Default has also had several updates and merges during this time too. Thankfully the intermediate feature hasn't been updated concurrently with the new feature. I've looked into various commands and options available and lost how to best fix this situation. The tree currently looks roughly like:
Default -- Various edits and merges -- tip
\
\-- Named Branch 1 (15 changes) -- Named Branch 2 (30 edits)
I want to get to a point where default has the changes from Named Branch 2 but none from Named Branch 1. I also want a branch that has the changes from Named Branch 1 available for when we return to this feature.
I suspect there's no nice easy way to do this, and there's going to be some messy parts in the history, however I am at a loss at how to start going about this task.

hg graft can cherry-pick changesets from one branch to another. Update to the destination branch, then graft the revisions you want to copy by running, for example:
hg graft 8 9 10
Conflicts will be handled using the normal merge process.
If you use TortoiseHg, select the changesets to graft to the current selected changeset, then right-click and select Graft Selected to local...:
Result:

Since you want to move the entire branch 2, you could consider using rebase. Make a clone of the repository and try this:
hg rebase --source <first branch2 rev> --dest <new parent in default> --keepbranches
This will in principle transform the history to what it should have been. However:
You may have to resolve conflicts arising when <first branch2 ver> gets moved to a new parent.
Since rebase rewrites history, you'll have to get everyone to cooperate in synchronizing their repositories. Whether that's feasible or worth the trouble in your case I can't say, but it's not that difficult: Assuming everyone has pushed any changes in branch 2, they can pull the new history and then get rid of the obsolete version of branch 2 with hg strip:
hg strip <first branch2 rev>

Why do I sometimes but not always end up with two heads on a branch in Mecurial?

I sometimes but not always end up with two heads on my default branch. The sequence of events is something like:
hg pull
hg update -C default
hg branch mybranch
hg merge default //merge in default
hg commit -m"merged mybranch into default"
hg heads -default //shows 2 heads
hg push --branch default //won't let ne. create 2 heads
The 'rival' head appears to be a changeset committed to default earlier in the day.
What I don't understand is why this happens sometimes and not other times.
The explanation I am usually offered is that the other guy pushed a change after I did a pull (my first action in the list above). But I think this happens in other cases e.g. when he pushed his changeset before I started.
I would have thought that when I pull default with his commit I get default with one head. My merge/commit should just create a new head after that. Why does it create a second head?

First of all, this is a totally normal situation. It's not a problem, or an error, or something to be avoided -- it's how a DVCS works.
In brief: you get two heads whenever two people start working from the same commit and do different things. It doesn't matter if they do it in their own (named) branch (as you're doing above) or on default. Once you merge your work back to default and someone else has done work on default you've got two heads and someone has to merge. It's just how things are.
Before you push do a hg pull and then a hg merge and you'll integrate your work with yours, by creating a new merge commit, with two parents -- your work and their work -- and then you push and you'll not see that warning.
Incidentally, you might want to check out the bookmarks feature. It's better suited for per-feature branches like you appear to be doing above than the named branches you're using, but it in no way saves you from having to deal with multiple heads.

check state of repository immediately after pull (pull can bring new head in default for you) with hg heads
hg branch without commit do nothing (you haven't mybranch)
Even if you'll create mybranch and merge it with default's tip (according to your syntax), it will not eliminate 2-nd head in default

Mercurial moving commits in another branch

My coworker accidentally made two commits in the default branch instead of creating new his own development branch.
How can I change this situation and moves these two commits to a new branch?

Imagine the following scenario:
D
|
C
| "I want to move C and D here"
B/
|
A
Steps:
hg update B
hg branch "mybranch"
hg commit --message "Create my branch"
hg update mybranch
hg graft -r C
hg graft -r D
hg strip -r C (this should be the revision C had originally)
The strip command is provided by an extension that you need to enable. You can follow a guide on how to enable it on the Mercurial Wiki.
hg update default

A major question
Have the accidental commits reached other repositories or is it just in his own? If so, you can skip to the section below 'Maybe the cat is still in the bag' otherwise you may have a fair bit of work to do.
You are not alone
See here for more discussion on how to correct the problem elsewhere on Stack Overflow. What is described is the 'proper' way to to it
export a patch
create the branch
import the patch
delete the earlier commits.
Maybe the cat is still in the bag
If the changes are only in the local copy, then a simpler solution is to
create the new branch
switch to it
merge the changes onto that either with your fav merge tool (go Meld) or with hg graft
use the hg strip command to delete the changes on the old brach
push the changes to the world
pretend like nothing ever happened, whistle a happy tune ...

The two answers above are both correct but, assuming one has not yet pushed the commits, there's a third way.
I just successfully used the rebase command to move a string of commits to a topic branch I had forgotten to create in the first place.
I first updated to the revision from which I wanted to create the branch on which my commmits were supposed to be, then I rebased the earliest of my commits from the wrong branch on this new one and ta-da, done.
Takes more time to explain it than to do it with TortoiseHg or even the command line, really.

Mercurial workflow to deal with sudden high priority tasks

The scenario is that I arrive at work one morning and pull, update, merge my mercurial tree. From the now latest revision I begin working on todays task. I reach a logical milestone and does a commit. The tree now looks like this:
* <- my first commit for the day
|
* <- merge commit from shared repos by team
Now the boss comes along, something horrible has happened and it needs my immediate attention. He wants the solution pushed asap.
My plan for the day is disrupted. What is the best way to use mercurial to tackle this problem?
I could just write the solution and make another commit followed by a push. But this is bad since that commit would have my own first commit as parent, thus pushing incomplete code.
In a perfect world maybe that new feature Im coding will be in its own branch. But its a small feature and Im lazy, so it currently resides in default. I could try to find a way to move a commit from one branch to another.
I could ignore the panicking boss, do two or three more commits to complete the feature, push that and then start working on the fix, bringing those in as a separate commit and push.
Neither of those feels good. I would like something along the lines of:
Do an update, bringing me back to the merge commit.
Do the fix and commit it.
Push only the fix commit, leaving the incomplete-feature commit still unpushed.
Thus making the history like this:
* <- fix commit (pushed)
|
| * <- my first commit for the day (unpushed)
|/
* <- merge commit from shared repos by team
We've just migrated to mercurial here at the office, and this is one problem I would like to tackle. Any mercurial gurus here that mind sharing some wisdom?

The workflow you're describing looks good (update to merge, commit, push -r the fix). If you feel not comfortable with anonymous branches, you can first clone -r the repo up to the merge, commit and push.

How does Mercurial work with many developers?

I look at Mercurial repositories of some known products, like TortoiseHg and Python, and even though I can see multiple people committing changes, the timeline always looks pretty clean, with just one branch moving forward.
However, let's say you have 14 people working on the same product, won't this quickly get into a branch nightmare with 14 parallel branches at any given time?
For instance, with just two people, and the product at changeset X, now both developers start working on separate features on monday morning, so both start with the same parent changeset.
When they commit, we now have two branches, and then with 14 people, we would quickly have 10+ (might not be 14...) branches that needs to be merged back into the default.
Or... What am I not seeing here? Perhaps it's not really a problem?
Edit: I see there's some confusion as to what I'm really asking about here, so let me clarify.
I know full and well that Mercurial easily handles multiple branches and merging, and as one answer states, even when people work on the same files, they don't often work on the same lines, and even then, a conflict is easily handled. I also know that if two people end up creating a merge hell because they changed a lot of the same code in the same files, there's some overall planning failure here, since we've placed two features in the exact same place onto two developers, instead of perhaps trying them to work together, or just giving both to one developer in the first place.
So that's not it.
What I'm curious about is how these open source project manage such a clean history. It's not important to me (as one comment wondered) that the history is clean, I mean, we do work in parallel, that the repository is able to reflect that, so much the better (in my opinion), however these repositories I've looked at doesn't have that. They seem to be working along the Subversion model where you can't commit before you've updated and merged, in which case the history is just one straight line.
So how do they do it?
Are they "rebasing" the changes so that they appear to be following the latest tip of the branch even though they were originally committed a bit back in the branch history? Transplanting changesets to make them appear to' having been committed in the main branch to begin with?
Or are the projects I've looked at either so slow (at the moment, I didn't look far back in the history) at adding new things that in reality they've only been working one person at a time?
Or are they pushing changes to one central maintainer who reviews and then integrates? It doesn't look like that since many of the projects I looked at had different names on the changesets.

Or... What am I not seeing here?
Perhaps it's not really a problem?
It's not really a problem. In a large project even when people work on the same feature, they don't usually work on the same file. When they work on the same file, they don't usually modify the same lines. And when they modify the same lines, then a merge should be done manually (for the affected lines).
This means in practice that 80+% of the merges can be done automagically by Mercurial itself.
Let's take an example:
you have:
[branch 1] [branch2]
\ /
\ /
[base]
Edit: for clarity, by branch I refer here to unnamed branches.
If you have a file changed in branch 1 but the same file in branch 2 is the same as in base, then the version in branch 1 is chosen. If the file is modified in both branch 1 and branch 2 the files are merged line by line using the same algorithm: if line 1 in file1 in branch 1 is different than line 1 in file1 in base but branch 2 and base have the line 1 equal, line 1 in branch 1 is chosen (and so on and so forth).
For the lines that are modified in both branches, Mercurial interrupts the automated merging process and prompts the user to choose which lines to use, or edit the lines manually.
Since deciding which lines to use is best done by the person(s) who modified those lines, a good practice is to have the person that implemented a feature perform the merge. That means that if me and you work on the same project, I implement my feature, then make a pull from a central/common repository (get the latest version that everyone uses), then merge my new version with the pulled changes, then publish it to the common repository (at this point, the common repository has one main branch, with my merged changes into it). Then, you pull that from the server and do the same with your changes.
This implies that everyone is capable of doing whatever they want in their local repository, and the common/official repository has one branch. It also means that you need to decide on a time frame when people should merge their changes in.
I used to have three or four repositories on my machine already compiled on different product versions (different branches of the repository) and a few different branches in my main repository (one for refactoring, one for development and so on). Whenever I would bring one branch to a stable state (say - finish a refactoring) I would pull from the server, merge that branch into the pulled changes, then push it back to the server and let anyone know that if they made any changes to the affected files, they should pull first from the server.
We used to synchronize implemented features every Monday morning and it took us about an hour to merge everything, then make a weekly build on the server to give to QA (on bad days it would take two member of the team two hours or so, then everyone would pull the week's changes on their machine and use them as a new base for the week). This was for an eight-developers team.

In your updated question it seems that you are more interested in ways of tidying up the history. When you have a history and want to make it into a single, neat, straight line you want to use rebase, transplant and/or mercurial queues. Check the docs out for those three and you should realise the workflow for how its done.
Edit: Since Im waiting for a compile, here follows a specific example of what I mean:
> hg init
> echo test > a.txt
> hg addremove && hg commit -m "added a.txt"
> echo test > b.txt
> hg addremove && hg commit -m "added b.txt"
> hg update 0 # go back to initial revision
> echo test > c.txt
> hg addremove && hg commit -m "added c.txt"
Running hg glog now shows this (diverging) history with two branches:
# changeset: 2:c79893255a0f
| tag: tip
| parent: 0:7e1679006144
| user: mizipzor
| date: Mon Jul 05 12:20:37 2010 +0200
| summary: added c.txt
|
| o changeset: 1:74f6483b38f4
|/ user: mizipzor
| date: Mon Jul 05 12:20:07 2010 +0200
| summary: added b.txt
|
o changeset: 0:7e1679006144
user: mizipzor
date: Mon Jul 05 12:19:41 2010 +0200
summary: added a.txt
Do a rebase, making changeset 1 into a child of 2 rather than 0:
> hg rebase -s 1 -d 2
Now lets check history again:
# changeset: 2:ea0c9a705a70
| tag: tip
| user: mizipzor
| date: Mon Jul 05 12:20:07 2010 +0200
| summary: added b.txt
|
o changeset: 1:c79893255a0f
| user: mizipzor
| date: Mon Jul 05 12:20:37 2010 +0200
| summary: added c.txt
|
o changeset: 0:7e1679006144
user: mizipzor
date: Mon Jul 05 12:19:41 2010 +0200
summary: added a.txt
Presto! Single line. :)
Also note that I didnt do a merge. When you rebase like this, you will have to deal with merge conflicts and everything just like as if you did a merge. Because thats pretty much what happens under the hood. Experiment with this in a small test repo. For example, try changing the file added in revision 0 rather than just adding more files.

I'm a Mercurial developer, so let me explain how we/I do it.
In the Mercurial project we accept contributions in form of patches sent to the mailinglist. When we apply those with hg import, we do an implicit rebase to the tip of the branch we are working on. This help a lot with keeping the history clean.
As for my own changes, I use rebase or mq to linearize things before I push them, again to keep the history tidy. It's basically a matter of doing
hg push # abort: creates new remote head
hg pull
hg rebase
hg push
You can combine the pull and rebase if you like (hg pull --rebase) but I've always liked to take one step at a time.
By the way, there are some disagreements about this practice of linearizing the history -- some believe that the history should show how things really happened, with all the branches and merges and whatnot. I find that as long as you don't mess with public changesets, then it's okay and useful to linearize history.

The Linux kernel is stored in thousands of repositories and probably millions of branches, and this doesn't seem to pose a problem. For large projects you need a repository strategy (e.g., the dictator–lieutenants strategy), but having many branches is the main strength of the modern DVCSes and not a problem at all.

Yes, we'll have to merge and to avoid heads on the main repository, merging should be done on the child repositories by the developer.
So before you push your code to the parent repository you first pull the latest changes, merge on your side and (try to) push. This should avoid unwanted heads in the master repo

I don't know how the TortoiseHg team does things, but you can use Mercurial's rebase extension to "detach" a branch and drop it on the top of the tip, creating a single branch.
In practice, though, I don't get concerned about multiple branches, as long as I don't see more heads than there should be. Merging is not really a big deal.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008