In Mercurial, can I merge just some files between two branches? [duplicate]

Reading up on Mercurial, it seems to always branch and merge the complete repositories.
Is it possible to just merge some files from one branch to another? (For example I may only wish to merge in the files that fix a given bug.)
Likewise can I cherry pick some change sets, but still have a correct merge record, so if a complete merge is done later it is correct?
I am coming from a Perforce “mindset”, so I may be thinking about this the wrong way.

Yes, Mercurial always branches and merges the whole tree. You don't have the "flexibility" that something like Perforce gives you to select individual files for a merge. This is a good thing (trust me). Changesets are atomic (you can't split them) and immutable (you can't change them), so this needs a little bit of a mindset change.
Changesets should be targeted at one task, and one task only. If you're fixing a bug, nothing else goes in the changeset apart from the bug fix. You've then got a changeset which documents that bug fix, and you haven't got the problem of wanting to split it. It wouldn't make sense to want to. Half a bug fix is often worse than no bug fix.
When it comes to merging, there are a few options:
One school of thought says you should go back to where the bug was introduced. Fix it. Commit (making a small anonymous branch), and merge that forward onto whatever head you want it on (dev, stable, release, whatever). This isn't always practical though.
Another method is fixing the bug in the release branch, and then merging to the development branch. This normally works well.
Alternatively, you could fix it at the head of your development branch, but then if you merge it onto your release branch you'll bring over all your development changes. This is where graft (new in 2.0) and the older transplant extension come into play. They allow you to "cherry-pick" a single changeset or a range of changesets from one branch and place them on another.

Reading up on Mercurial, it seems to always branch and merge the complete repositories.
Yes
Is it possible to just merge some files from one branch to another? (For example I may only wish to merge in the files that fix a given bug.)
Touch only the files you need in the changeset. Then either merge the branch that has this changeset as its head into the other branch, or transplant the changeset at any time.
Likewise can I cherry pick some change sets, but still have a correct merge record, so if a complete merge is done later it is correct?
Yes, you can transplant any changesets to another branch. The applied state will be remembered, and the changes will not be duplicated in the final merge.

Related

Mercurial: devs work on separate folders, why do they have to merge all the time

I have four devs working in four separate source folders in a mercurial repo. Why do they have to merge all the time and pollute the repo with merge changesets? It annoys them and it annoys me.
Is there a better way to do this?
Assuming the changes really don't conflict, you can use the rebase extension in lieu of merging.
First, put this in your .hgrc file:
[extensions]
rebase =
Now, instead of merging, just do hg rebase. It will "detach" your local changesets and move them to be descendants of the public tip. You can also pass various arguments to modify what gets rebased.
Again, this is not a good idea if your developers are going to encounter physical merge conflicts, or logical conflicts (e.g. Alice changed a feature in file A at the same time as Bob altered related functionality in file B). In those cases, you should probably use a real merge in order to properly represent the relevant history. hg rebase can be easily aborted if physical conflicts are encountered, but it's a good idea to check for logical conflicts by hand, since the extension cannot detect those automatically.
Your development team are committing little and often. This is just what you want, so you don't want to change that habit for the sake of a clean line of commits.
@Kevin has described using the rebase extension, and I agree that can work fine. However, you'll also see the work sequence of each developer squished together in a single line of commits. If you're working on a stable code base and just submitting quick single-commit fixes, then that may be fine. If you have ongoing lines of development, then you might not want to lose the continuity of a developer's commits.
Another option is to split your repository into smaller self-contained repositories.
If your developers are always working in 4 separate folders, perhaps the contents of these folders can be modularised and stored as separate Mercurial repositories. You could then have a separate master repository that brought all these smaller repositories together within the sub-repository framework.
Mercurial is distributed, which means that if you have a central repository, every developer also has a private repository on his/her workstation, and a working copy as well, of course.
So now let's suppose that they make a change and commit it, i.e., to their private repository. When they want to hg push two things can happen:
either they are the first to push a new changeset to the central server, in which case no merge is required, or
somebody else, starting from the same version, has committed and pushed before them. There is a fork here: from the same starting point, history now goes in two different directions, so a merge is required even if there is no conflict, because we do not want four divergent contexts on the central server. (That is possible with Mercurial, by the way: they are called heads, and you can force the push without merging. But you still have the divergence, no magic, and this is probably not what you want, because you want to be able to check out the sum of all the contributions.)
Avoiding merges is quite simple: tell your developers to integrate others' changes before committing their own:
$ hg pull
$ hg update
$ hg commit -m"..."
$ hg push
When the commit is made against the latest central version, no merge should be required.
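To make the "two directions" idea concrete, here is a toy model in Python. It is only a sketch: the graph layout and the changeset names (base, alice-1, bob-1) are made up for illustration and have nothing to do with how Mercurial actually stores history. A head is simply a changeset that no other changeset names as a parent.

```python
# Toy model of a changeset graph: each changeset maps to its list of parents.
# Illustrative only; this is not Mercurial's real storage format.

def heads(graph):
    """Return the changesets that no other changeset names as a parent."""
    parents = {p for ps in graph.values() for p in ps}
    return sorted(c for c in graph if c not in parents)

# Alice and Bob both commit on top of "base" without seeing each other's work.
diverged = {"base": [], "alice-1": ["base"], "bob-1": ["base"]}

# A merge changeset has two parents and joins the two heads again.
merged = dict(diverged, merge=["alice-1", "bob-1"])

# Bob pulls and updates first, so his commit lands on top of Alice's.
linear = {"base": [], "alice-1": ["base"], "bob-1": ["alice-1"]}

print(heads(diverged))  # ['alice-1', 'bob-1']  two heads, a merge is needed
print(heads(merged))    # ['merge']
print(heads(linear))    # ['bob-1']             one head, no merge needed
```

In the last case the history stays a single line, which is the situation the pull/update/commit/push sequence is aiming for.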
If they were working on the same code, some tests should also be run after the pull and update, to ensure that what was working in isolation still works once the other developers' work has been integrated. Taking others' contributions frequently, and pushing our own changes just as frequently, is called continuous integration and ensures that integration issues are discovered quickly.
Hope it'll help.

How to revert a file to an earlier version in Mercurial?

I made some changes to a file and committed it. (In fact there were several commits).
Then I wanted to revert to the earlier version and lose all those changes.
I did something like:
hg update -r nnn where nnn was the revision number of the changeset I wanted to go back to.
That worked. I was happy.
Then, later, I had to push my local repository to the remote. But when I did hg push I got a message about there being two heads on this branch and one of them not being known to the remote repository. It suggested I merge before pushing. (I think).
I googled this and found a page that suggested I do "hg merge". I did that. Now the resultant file is back to where I started. I.e. it contains all the changes I wanted to throw away.
Where did I go wrong?
EDIT:
I have found this post Mercurial — revert back to old version and continue from there
where it says:
If later you commit, you will effectively create a new branch. Then
you might continue working only on this branch or eventually merge the
existing one into it.
That sounds like my case. Something went wrong at the merging stage it seems. Was I on the wrong branch when I did "hg merge"?
You're past this point now but if it happens again, and it's just a single file you want to revert then consider:
hg revert --rev REVISION_YOU_LIKED path/to/just/one/file.txt
That doesn't update your whole repository to a different revision, and it doesn't create any commits. It just takes a single file in your working directory and makes it look like it used to. After doing that you can just commit and you're set.
That's not the way to go if you want to undo all the changes you've made to all files, but for reverting a single file use revert and avoid multiple heads and merging entirely.
No, nothing went wrong at the merge stage – Mercurial did exactly what you asked it to...
What merge means is that you take the changes on your current branch, and the changes on the 'other' branch, and you merge them. Since your original changes were in the 'other' branch, Mercurial carefully merged them back into your current branch.
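A toy illustration of why the merge brought everything back. This is plain Python, not Mercurial's real data model, and the changeset names are invented. The point is that a merge result carries the work of every ancestor of both parents, so the commits you wanted to abandon come along with it.

```python
# Toy changeset graph: each changeset maps to its list of parents.
# Names are hypothetical; this only sketches the idea.

def ancestors(graph, node):
    """Return node plus everything reachable by following parent links."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current])
    return seen

graph = {
    "good": [],                    # the revision you updated back to
    "unwanted-1": ["good"],        # the commits you wanted to lose
    "unwanted-2": ["unwanted-1"],
    "new-work": ["good"],          # committed after going back: a second head
}

# 'hg merge' followed by a commit creates a changeset with both heads as parents.
graph["merge"] = ["new-work", "unwanted-2"]

print(sorted(ancestors(graph, "merge")))
# ['good', 'merge', 'new-work', 'unwanted-1', 'unwanted-2']
```

Compare with the ancestors of "new-work" alone, which are just "good" and "new-work": that is the history you actually wanted to push.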
What you needed to do was to discard the 'other' branch. There are various ways of doing that. The Mercurial help pages discuss the various techniques, but there are pointers in other SO questions: see for example Discard a local branch in Mercurial before it is pushed and Remove experimental branch.
(Edit) Afterthought: the reason you got a warning about there being two heads on the branch is because having two heads is often a temporary situation, so pushing them to a remote repository is something you don't want to do accidentally. Resolutions are (i) you did mean to push them, so use --force to create two heads in the remote repository; (ii) ooops!, you meant to merge them before pushing, so do that; or (iii) ooops!, you'd abandoned the 'other' one, so get rid of it. Your case was (iii).

What is the best way to do a code review across multiple commits, with TortoiseHg?

The problem that I'm running into is that I have some code reviews to do, with ~10 commits per review. It's an active repo with constant commits from developers. I have TortoiseHg filtering my changesets so that I am looking only at the ones that I care about.
What I would like to see is the difference between the changeset before the first change, and the last (without all the non-related changesets showing). I simply want to see the final results of all these changes. I don't care that there was some horrible code in changeset 1, that was fixed in 3. I just want to see the diff of what ultimately got merged through all these changesets.
I feel like I'm missing the obvious, and this isn't a bright question. Nevertheless, I'm asking anyways. Anyone?
I'm not sure about 1.1.8, as I'm using the 1.9/2.0 candidate release, but I believe you could left-click on changeset1, right-click on revision3 and select visual Diff. This should open your diff tool of choice and only show you the diffs between the 2 versions.
When I did this in the newer tortoise, it opened BeyondCompare in directory compare mode, with revision1 on one side, and revision2 on the other.
Don't merge in between commits, and diff off the developer's clone between the start and finish changesets.
Or, if merges occurred, update and merge everything, then take the entire codebase (or just the changed files) and dump it onto a clean tip clone (make sure you are working with the same version to avoid overwriting anything). Recommit all at once.

A mercurial merge chose the wrong changes, what is the correct way to fix this?

Changes were made to our .vcproj to fix an issue on the build machine (changeset 1700). Later, a developer merged his changes (changes 1710 through 1715) into the trunk, but the mercurial auto-merge overwrote the changes from 1700. I assume this happened because he chose the wrong branch as the "parent" of the merge (see part 2 of the question).
1) What is the "correct" mercurial way to fix this issue, considering out of all the merged files, only one file was merged incorrectly, and
2) what should the developer have done differently in order to make sure this didn't occur? Are there ways we can enforce the "correct" way?
Edit: I probably wasn't clear enough on what happened. Developer A modified a line in our .vcproj file that removed an option for the compiler. His check-in became changeset 1700. Developer B, working from a previous parent (let's say changeset 1690), made some changes to completely different parts of the project, but he did touch the .vcproj file (just not anywhere near the changes made by Developer A). When Developer B merged his changes (becoming changes 1710 through 1715), the merge process overwrote the changes from 1700.
To fix this, I just re-modified the .vcproj file to include the change again, and checked it in. I just wanted to know why Mercurial thought that it shouldn't keep the changes in 1700, and whether or not there was an "official" way to fix this.
Edit the second: Developer B swears up and down that Mercurial merged the .vcproj file without prompting him for conflict resolution, but it is of course possible that he's just misremembering, in which case this whole exercise is academic.
I will address the 2nd part of your question first...
If there is a conflict, the automated merge tools should force the programmer to decide how the merge happens. But the general assumption is that a conflict will involve two edits to the same set of lines. If somehow a conflict arises because of edits to lines that are not close to each other the automated merge will blithely choose both of the edits and a bug will appear.
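Here is a minimal sketch of that behaviour in Python. It is a deliberately simplified line-by-line three-way merge that assumes edits only replace lines and never add or remove them. Real merge tools, including Mercurial's, work on regions of text and are more careful than this. The settings file contents are invented for the example.

```python
def merge3(base, ours, theirs):
    """Toy three-way merge; assumes all three versions have the same number of lines."""
    result, conflicts = [], []
    for number, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:        # both sides agree, or neither touched the line
            result.append(o)
        elif o == b:      # only "theirs" changed this line
            result.append(t)
        elif t == b:      # only "ours" changed this line
            result.append(o)
        else:             # both changed the same line in different ways
            conflicts.append(number)
            result.append(o)
    return result, conflicts

base = ["timeout = 30", "retries = 3", "unit = seconds"]
ours = ["timeout = 90", "retries = 3", "unit = seconds"]    # wants 90 seconds
theirs = ["timeout = 30", "retries = 3", "unit = minutes"]  # wants 30 minutes

merged, conflicts = merge3(base, ours, theirs)
print(merged)     # ['timeout = 90', 'retries = 3', 'unit = minutes']
print(conflicts)  # []

# Two different edits to the same line are reported instead of silently combined.
_, clash = merge3(base, ours, ["timeout = 10", "retries = 3", "unit = seconds"])
print(clash)      # [0]
```

The first merge is "clean" as far as the tool can tell, yet the result is a 90 minute timeout that nobody asked for.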
The general case of a merge tool always merging properly is very hard to solve, and really can't be with current technology. Here is an example of what I mean from C:
int i;  // Someone replaces this with 'short i' in one changeset, stating
        // that a short is more efficient.
// ... lots of code;
// Someone else replaces all the 65000s with 100000s in another changeset,
// saying that more precision is needed.
for (i = 0; i < 65000; ++i) {
    integral_approximation_piece(start + i/65000.0, end + (i + 1) / 65000.0);
}
No merge tool is going to catch this kind of conflict. The tool would have to actually compile the code to see that those two parts of the code have anything to do with each other, and while that would likely be enough in this case, I can construct an example that would require the code to be run and the results examined to catch the conflict.
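For anyone who wants to see the arithmetic behind that example, here is a small Python sketch using ctypes to mimic a 16-bit signed integer. It assumes that short is 16 bits wide, which is common but not guaranteed by the C standard, and as_short is just a helper name made up for this illustration.

```python
import ctypes

def as_short(n):
    """Store n in a 16-bit signed integer, wrapping the way a typical C 'short' does."""
    return ctypes.c_int16(n).value

print(as_short(32767))      # 32767, the largest value that fits
print(as_short(32767 + 1))  # -32768, the counter wraps around
print(as_short(100000))     # -31072, the new loop bound does not fit at all

# After both changesets are merged, the loop reads 'for (i = 0; i < 100000; ++i)'
# with 'short i'. The counter can never reach 100000, so the condition stays true.
```

Each change is harmless on its own: a short counts up to 65000 only if it happens to be unsigned or wider than 16 bits, and an int counts up to 100000 without trouble on common platforms. It is only the merged combination that is guaranteed to be broken.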
This means that what you really ought to do is rigorously test your code after a merge, just like you should after any other change. The vast majority of merges will result in obvious conflicts that a developer will have to resolve (even though that resolution is often fairly obvious), or will merge cleanly. But the very few merges that don't fit either category can't easily be handled in an automated fashion.
This can also be fixed by development practices that encourage locality. For example a coding standard that states "Variables should be declared near where they're used.".
I'm guessing that .vcproj files are particularly prone to this problem since they are not well understood by developers and so if conflicts do appear they will not be sure what to do with them. My guess is that this happened and your developer simply did a revert back to the revision (s)he checked in.
As for part 1...
What to do in this case depends a lot on your development process. You can strip the merge changeset out and redo it, though that won't work very well if lots of people have already pulled it, and it will work especially poorly if lots of changesets based on the merge changeset have already been checked in.
You can also check in a new change that fixes the problem with the merge.
Those are basically your two options.
The tone of your post seems to me to indicate that you may have some politics surrounding this issue in your organization, and people are blaming this error on the frequent merges of Mercurial. So I will point out that any change control system can have this problem. In the case of Subversion, for example, every time a developer does an update while they have outstanding changes in their working directory, they are doing a merge, and this kind of problem can arise with any merge.
In Mercurial a merge doesn't have a single parent; by definition it has two and only two parents. When someone is merging they're making two choices:
Which two changesets will be the two parents
Which of those changesets will be the left-parent and which will be the right-parent
Of those two questions the first is very important, and the second barely matters at all, though it took me a while to come to understand that.
You select the left-parent by using hg update X. That changes the output of hg parents (or in newer versions hg summary) and essentially determines what's in your working directory before the merge.
You select the right-parent by using hg merge Y. That says merge X (the working directory's parent) with changeset Y. As a special case, if there are only two heads in your repository and your parent is already one of them, then Y will default to the other.
I'd have to see your resulting graph to know just what the developer did, but it's possible he didn't update to one head or another before invoking merge, which would have him merging one head with some point back in history.
If your developer picked the right parents for the merge then the left vs. right doesn't much matter -- the only real difference is that when one uses hg diff or hg log -p or some other command that shows the patch for a merge changeset, it's displayed relative to the left-parent. That's, however, mostly a factor in display only. Functionally they're pretty much identical.
Assuming your developer picked the right changesets then what he should have done was test the result of the merge before committing it. Merging is software development, not an annoying VCS side effect, and not testing before committing is the error.
Fixing
To fix this, just re-do the merge correctly. Use hg update to set one parent, use hg merge to pick the other. Make sure your current working directory is correct and then commit. You can get rid of his bad merge using something like hg strip or better, just close down his branch with hg commit --close-branch after updating to it.
Avoiding
You say "Mercurial auto-merge", but Mercurial doesn't really auto-merge. It does a premerge, which is an extremely cautious combination of obvious changes. It is so careful that it won't even merge for you if each merge parent adds code in the same region, because it can't know which block of code you'd rather have first.
You can disable this premerge entirely or on a file-by-file basis using the merge tool configuration options:
https://www.mercurial-scm.org/wiki/MergeToolConfiguration?highlight=premerge

best practices in mercurial: branch vs. clone, and partial merges?

...so I've gotten used to the simple stuff with Mercurial (add, commit, diff) and found out about the .hgignore file (yay!) and have gotten the hang of creating and switching between branches (branch, update -C).
I have two major questions though:
If I'm in branch "Branch1" and I want to pull in some but not all of the changes from branch "Branch2", how would I do that? Particularly if all the changes are in one subdirectory. (I guess I could just clone the whole repository, then use a directory-merge tool like Beyond Compare to pick&choose my edits. Seems like there ought to be a way to just isolate the changes in one file or one directory, though.)
Switching between branches with update -C seems so easy, I'm wondering why I would bother using clone. I can only think of a few reasons (see below) -- are there some other reasons I'm missing?
a. if I need to act on two versions/branches at once (e.g. do a performance-metric diff)
b. for a backup (clone the repository to a network drive in a physically different location)
c. to do the pick&choose merge like I've mentioned above.
I use clone for:
Short-lived local branches
Cloning to different development machines and servers
The former use is pretty rare for me - mainly when I'm trying an idea I might want to totally abandon. If I want to merge, I'll want to merge ALL the changes. This sort of branching is mainly for tracking different developers' branches so they don't disturb each other. Just to clarify this last point:
I keep working on my changes and pull my fellow devs' changes, and they pull mine.
When it's convenient for me I'll merge ALL of the changes from one (or all) of these branches into mine.
For feature branches, or longer lived branches, I use named branches which are more comfortably shared between repositories without merging. It also "feels" better when you want to selectively merge.
Basically I look at it this way:
Named branches are for developing different branches or versions of the app
Clones are for managing different contributions to the same version of the app.
That's my take, though really it's a matter of policy.
For question 1, you need to be a little clearer about what you mean by "changes". Which of these do you mean:
"I want to pull some, but not all, of the changesets in a different branch into this one."
"I want to pull the latest version of some, but not all, of the files in a different branch into this one."
If you mean item 1, you should look into the Transplant extension, specifically the idea of cherrypicking a couple of changesets.
If you mean item 2, you would do the following:
Update to the branch you want to pull the changes into.
Use hg revert -r <branch you want to merge> --include <files to update> to change the contents of those files to the way they are on the other branch.
Use hg commit to commit those changes to the branch as a new changeset.
As for question 2, I never use repository clones for branching myself, so I don't know. I use named branches or anonymous branches (sometimes with bookmarks).
I have another option for you to look into: mercurial queues.
The idea is to have a stack of patches (not commits, "real" patches) on top of your current working directory. You can then apply or remove patches: add one, remove it, add another one, etc. A single patch or a subset of them ends up being a new "feature", as you would probably want to do with branches. After that, you can apply the patch as usual (since it is a change). Branches are probably more useful if you work with somebody else... ?