How to auto-merge 2 heads with mercurial

How to auto-merge 2 heads with mercurial - mercurial

We just changed over to mercurial from subversion and there is one thing that is taking up more time than expected; merging heads.
We love the fact that it keeps merges independent from the 2 commits (compared to subversion) but we end up on a regular basis merging 2 heads for unrelated changes.
Simple scenario.
Both me and Bob are up to date.
We both have ou repo up to date on default (aka main) branch and do improvement in different files.
We commit and only one will be able to push to the central server, the other one will create 2 heads. Then, pull, select 2 heads, merge (it will go easily since changes are on different files). Commit, then push.
Therefore, is there an extension that does these steps
Attempt merge
If no conflicts
Commit
else
Cancel merge
We are looking to have this run on an automated server, so +1 it this is command line and another +1 if it can do the merge without touching the working copy.
Thanks!
Update:
We ended up doing a few python scripts to manage the most common tasks (merge up & build; merge 2 heads).
Thanks for the help!

It sounds like you should be able to use hg fetch for this. It'll pull the changes from the server, merge, and then automatically commit the merge. It does prompt for merge conflicts as well. It's included with Mercurial, so just add
fetch =
to your hgrc, and you should be all set. It doesn't automatically push, but that's usually a bad idea anyway. You would typically want to run tests and resolve any merge problems before pushing your code out to everyone else.

Are the merges really taking that much time? If they're "unrelated changes" doesn't it just take a blink?
Someone already suggested fetch and someone else will probably suggest rebase, but personally I consider merging to be coding, and want it to be manual. It takes almost no time and it's an opportunity to give a good message like "Pulling in Jane's work half-way through my FooBar work" (instead of the useless commit messages fetch provides).

Related

Mercurial: devs work on separate folders, why do they have to merge all the time

I have four devs working in four separate source folders in a mercurial repo. Why do they have to merge all the time and pollute the repo with merge changesets? It annoys them and it annoys me.
Is there a better way to do this?

Assuming the changes really don't conflict, you can use the rebase extension in lieu of merging.
First, put this in your .hgrc file:
[extensions]
rebase =
Now, instead of merging, just do hg rebase. It will "detach" your local changesets and move them to be descendants of the public tip. You can also pass various arguments to modify what gets rebased.
Again, this is not a good idea if your developers are going to encounter physical merge conflicts, or logical conflicts (e.g. Alice changed a feature in file A at the same time as Bob altered related functionality in file B). In those cases, you should probably use a real merge in order to properly represent the relevant history. hg rebase can be easily aborted if physical conflicts are encountered, but it's a good idea to check for logical conflicts by hand, since the extension cannot detect those automatically.

Your development team are committing little and often; this is just what you want so you don't want to change that habit for the sake of a clean line of commits.
#Kevin has described using the rebase extension and I agree that can work fine. However, you'll also see all the work sequence of each developer squished together in a single line of commits. If you're working on a stable code base and just submitting quick single-commit fixes then that may be fine - if you have ongoing lines of development then you might not won't want to lose the continuity of a developer's commits.
Another option is to split your repository into smaller self-contained repositories.
If your developers are always working in 4 separate folders, perhaps the contents of these folders can be modularised and stored as separate Mercurial repositories. You could then have a separate master repository that brought all these smaller repositories together within the sub-repository framework.

Mercurial is distributed, it means that if you have a central repository, every developer also has a private repository on his/her workstation, and also a working copy of course.
So now let's suppose that they make a change and commit it, i.e., to their private repository. When they want to hg push two things can happen:
either they are the first one to push a new changeset on the central server, then no merge will be required, or
either somebody else, starting from the same version, has committed and pushed before them. We can see that there is a fork here: from the same starting point Mercurial has two different directions, thus a merge is required, even if there is no conflict, because we do not want four different divergent contexts on the central server (which by the way is possible with Mercurial, they are called heads and you can force the push without merge, but you still have the divergence, no magic, and this is probably not what you want because you want to be able to checkout the sum of all the contributions..).
Now how to avoid performing merges is quite simple: you need to tell your developers to integrate others changes before committing their own changes:
$ hg pull
$ hg update
$ hg commit -m"..."
$ hg push
When the commit is made against the latest central version, no merge should be required.
If they where working on the same code, after pull and update some running of tests would be required as well to ensure that what was working in isolation still works when other developers work have been integrated. Taking others contributions frequently and pushing our own changes also frequently is called continuous integration and ensures that integration issues are discovered quickly.
Hope it'll help.

Help understanding the benefits of branching in Mercurial

I've struggled to understand how branching is beneficial. I can't push to a repo with 2 heads, or 2 branches... so why would I ever need/use them?

First of all, you can push even with two heads, but since you probably don't want to do that, the default behavior is to prevent you from doing it. You can, however, force the push to go through.
Now, as for branching, let's take a simple scenario in a non-distributed version control system, like Subversion.
Let's assume you have a colleague that is working in the same project as you. The current latest changeset in the Subversion repository is revision 100, you both update to this locally so that now both of you have the same files.
Ok, now your colleague has already been working on his changes for a couple of hours now, and so he commits. This brings the central repository up to revision 101. You're still on revision 100 locally, and you're still working on your changes.
At some point, you complete, and you want to commit, but Subversion won't let you. It says you have to update first, so you start the update process.
The update process wants to take your changes, and pretend you actually started with revision 101 instead of 100. If your changes are not in conflict with whatever it was your colleague committed, all is hunky dory, but if your changes are in conflict, you have a problem.
Now you have to merge your changes with his changes, and things can go haywire. For instance, you might end up merging one file OK, the second file OK, or so you think, and then the third file, and you suddenly discover that you've got some of the details wrong, it would've been better to merge the second file differently.
Unless you made a backup of your changes before updating, and sooner or later you will forget, you have a problem.
Now, the above scenario is actually quite common. Well, perhaps not the merging part, it depends on how many is working in the same area or files at the same time, but the "must update before committing" part is quite common with Subversion.
So how does Mercurial do it?
Well, Mercurial commits locally, it doesn't talk to any remote repository at all, so it won't stop you from committing.
So, let's try the above scenario again, just in Mercurial this time.
The tipmost changeset in the remote repository is revision 100. You both have cloned this down, and you're both starting to work on the changes, from revision 100.
Your colleague completes his changes and commits, locally. He then pushes his changeset up to the central repository, bringing the tip there up to revision 101.
You then complete your changes, and commit, also locally, and then you want to push, but you get the error message you've already discovered, and is asking about.
So how is this different?
Well, your changes are now committed, there is no way, unless you try really hard to accidentally lose them or destroy them.
Here's the 3 repositories in play and their current state:
Colleague ---98---99---100---A
Central ---98---99---100---A
You ---98---99---100---B
If you were to push, and was allowed to do this (or force the push through), the Central repository would look like this:
Central ---98---99---100---A
\
+--B
Two heads. If your colleague now pulled, which one should he continue working from? This question is the reason Mercurial will by default prevent you from causing this.
So instead you pull, and you get the above state in your own repository.
In other words, you can chose to impact your own repository and create multiple heads there, but you are not imposing that problem on anyone else.
You then merge, the same type of operation you had to do in Subversion, except your changeset is safe, it was committed, and you won't accidentally corrupt or destroy it. If, mid-merge, you want to start over, you can, nothing lost, no harm done.
After the merge, your local repository looks like this:
You ---98---99---100---A----M
\ /
+--B--+
This is now safe to push, and if your colleague now pulls, he knows that he has to continue from the M changeset, the one that merged his and your changes.
The above description is what happens due to Mercurials distributed nature.
You can also name branches, to make them more permanent. For instance, you might want to name a branch "stable", to signal that any changesets on that branch have been thoroughly tested and is safe for release to customers or to put into production. Then you would only merge changes onto that branch when said testing has been completed.
The nature, however, is the same as the above description. Whenever more than one person works on a project with Mercurial, you will get branches, and that's a good thing.

Whenever more than one clone of a repo is made and commits are made in those clones, branches happen, whether you name them by using the hg branch command or not. My philosophy is, you might as well give them a name. It makes things less confusing.
A good explanation of mercurial branches: http://stevelosh.com/blog/2009/08/a-guide-to-branching-in-mercurial/

Simplest workflow for non-developers using mercurial, working on different files, without having to think about merging?

I currently use SVN for a number of things that aren't exactly code, for instance xml files, report templates, miscellaneous files, etc. I have several non-developers who are comfortable using TortoiseSVN for this. They typically work as follows:
Person A - does an SVN Update on the folder of interest to them. Or perhaps just on a single file.
Person A - edits whichever file(s) they're working on. Perhaps add or remove files.
Person B - someone else is probably working on different files at this point
Person A - does an SVN Commit to save their changes to the repository.
Very occasionally they'll hit conflicts where more than one person has edited a file. Almost always this is just because they forgot step #1. Because they're always working on separate files, there are (almost) never real conflicts. As long as they do step #1 first everything works fine.
I'd like to move to Mercurial, however something holding me back is the prospect of having do 'merge' all the time, because Mercurial looks at the state of the entire repository, not just the files of interest at a particular time. e.g. the workflow would be like this:
Person A - does a pull and update on the repository. (let's assume there are no local changes so this is straightforward).
Person A - edits whichever file(s) they're working on. Perhaps add or remove files.
Person B - someone else edits, commits, and pushes a different file at this point
Person A - commits changes. Tries to push. Gets an error about multiple heads.
Person A - does a pull and update. update doesn't work: merge required.
Person A - does a merge. If using TortoiseHg it's a bit confusing working out what to click on to do the merge. I guess this is simpler on the command line, provided there are no complications.
Person A - commits the merge.
Person A - pushes the changes.
My resistance is that there are more steps, and the merge step is somewhat hard to get your head around if you're not a developer. Is there a way I can put these steps together to make the process nice and simple?

"Very occasionally they'll hit conflicts where more than one person has edited a file. Almost always this is just because they forgot step #1. Because they're always working on separate files, there are (almost) never real conflicts. As long as they do step #1 first everything works fine."
If this is the case why do you want to use a DVCS? Mercurial is great, but the benefits of a DVCS come from the ability to merge and fork and the ease of doing either, if your workflow requires neither why would you want to switch toolset?

Sounds like the rebase extension might work for you. The workflow becomes:
hg clone
make changes
hg commit
hg pull --rebase
hg push
The local revisions get "rebased" onto the latest tip on pull, which avoids the merge.

One possible approach is to have a point person who does all the real work of merging. I'm not a big fan of letting everyone push to one shared repos, expecially if they don't know what they are doing. An alternative approach is that A has local repos A, B has local repos B, and there is repos S, which combines A and B. Then, don't let A or B push to S. Instead let an expert pull from A and B, and do the merging in S. Then A and B never have to push to S. If they coordinate with the expert, then he/she will already have merged their changes into S by the time they pull updates from S, and so A and B will not have to merge either when pulling. This is actually the default mode in which DVCS works, since by default all repositories are read-only except by their owner.

Correct (best-practise?) procedure to stay in sync with a remote Mercurial repository?

As a former user of Subversion, we've decide to move over to Mercurial for SCM and it is confusing us a little. Although Mercurial is a distributed SCM tool we are using a remote repo to keep changes we make backed up on a server but we are finding a few teething troubles.
For example, when two or three of us work on our local repo's, we commit then push to the remote repo, we find that a lot of heads(?) are created. This confused the hell out of us and we had to do some merging etc to sort it out.
What is the best way to avoid so many heads and to keep a remote repo in sync with a number of developers?
Today, i've been working like this:
Change a File.
Pull from remote repo.
Update local working copy.
Merge? (why?)
Commit my changes to local repo.
Push to the remote repo.
Is this the best proceedure?
Although this has worked fine today, i can't help that feeling that i'm doing it wrong! To be honest i don't understand why merging even needs to be done at the pull stage because other people are working on different files?
Other than to tell me to RTFM have you any tips for using Mercurial is such a way? Any good online resources for information on why we get so many heads?
NOTE: I have read the manual but it doesn't really give much detail and i don't think i want to start another book at the minute.

You should definitely find some learning resources.
I can recommend the following:
hginit.com
Tekpub: Mercurial
As for your concrete question, "is this the best procedure", then I would have to say no.
Here's some tips.
First of all, you don't need to stay "in sync" with the central repository at all times. Instead, follow these guidelines:
Push from your local repository to the central one when you're happy with the changes you've committed. Remember, this can be several changesets
Pull if you need changes others have done right away, ie. there's a bugfix a colleague of yours has fixed, that you need, in order to continue with your own work.
Pull before push
Merge any extra heads you pulled down with your own changes, before you push, or continue working
In other words, here's a typical day.
You pull the latest changes when you come in in the morning, so that you got an up to date local clone. You might not always do this, if you're in the middle of bigger changes that you didn't finish yesterday.
Then you start working. You commit small changesets with isolated changes That isn't to say that you split up a larger bugfix into many smaller commits just because you modify multiple files, but try to avoid fixing more than one bug at a time, or implementing more than one feature at a time. Try to stay focused.
Then, when you're happy with all the changesets you've added locally, you decide to push to the server. When you try to do this, you get an abort message saying that extra heads would be pushed to the server, and this isn't allowed, so the push is aborted.
Instead you pull. This can always be done, but will of course now add extra heads in your local clone, instead of at the server.
Then you merge, the extra head that you got from the server, with your own head, the one that you created during the day by committing new changesets to your clone. You resolve any merge conflicts.
Then you push, and now it should succeed. On the off chance that someone has managed to push more changesets to the central repository while you were busy merging, you will get another abort and have to rinse and repeat.
The history will now show multiple parallel branches of development, but should always stay at max 1 head in your central repository. If, later on, you start using named branches, you can have 1 head per named branch, but try to avoid this until you get the hang of just the default branch.
As for why you need to merge? Well, Mercurial always work with revisions that are snapshots of the entire project, which means two branches, even though they contain changes to different files, are really considered two different versions of the entire project, and you need to tell Mercurial that it should combine them to get back to one version.

For one, you can pull at any time; pulling does just add changesets to your repo, but not change your local working files (except if you have enabed the post-pull update).
Merging is necessary if someone else has commited changes to the same branch you're currently working on. This created an implicit branch, and merging merely brings them back together. You can see this nicely with the "railroad track" in the repository view. Basically, as long as you don't merge, you stay on your own "private" track, and when you want to add your changes (can be any amount of changesets) you merge it back into the destination branch (typically "default"). It's painless - nothing like merging in older SVN versions!
So the workflow is not as rigid as you displayed it; it's more like this:
Pull as much as you like
Make changes and commit locally as often as you like
When your changes should be integrated, merge with the destination branch (can be a lower revision than the newest), commit and push
This workflow can be tuned somewhat, for instance by using named branches and sometimes by using rebase. However, you and your team should decide on the workflow to be used; Mercurial is quite flexible in this regard.

http://hginit.com has a good tutorial.
In particular, you'll find the list of steps you have here: http://hginit.com/02.html (at the bottom of the page)
The difference between those steps and yours is that you should commit after step 1. In fact you will typically commit several times on your local repository before moving onto the pull/merge/push step. You don't need to share every commit with the rest of developers right away. It'll often make sense to do several related changes and then push that whole thing.

Is constant merging with Mercurial common practice? Something wrong with this workflow?

My company is switching to Mercurial, and we're coming from Subversion.
We're noticing that we're having to do a LOT of merging in our workflow. For instance, if I change a file, commit, pull, update, push, and then my co-worker changes a file, commits, pulls, and updates, he gets a "crosses branches" error and has to do an hg merge. We're having to do this pretty much every single time we want to push to our central repository.
Is something wrong with our workflow?? It seems wrong that in our history for a given file there are going to be a ton of history entries that say "Merging with [changeset id]" "Merging with [changset id]."
Is this just the way it is? Or are we doing something wrong?

There's nothing wrong with this. The vast majority of merges should be automatic. You did create two heads when you both made changes stemming from the same revision and going in divergent directions - your changes might or might not conflict.
If you want to eliminate the "merge" changesets (which aren't actually a problem), you can change/pull/rebase/commit/push instead of change/commit/pull/merge/commit. In other words, before committing your changes, rebase them to the new tip.

If the merges are being executed without manual merge resolution, then I would say mercurial and your workflow are behaving as designed.

It is common as you really do need to get your repos in a consistent state. One thing that speeds it up is instead of
hg pull; hg update
use fetch
hg fetch
This will intelligently do a pull and than either an update or a merge. It comes with mercurial so it's basically a matter of editing your .hgrc to add a line like so:
[extensions]
hgext.fetch=
If the merge goes cleanly you won't even notice it happening. That was a big help in my workflow.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008