Patch corruption or loss after rebase - mercurial

I just lost all of the changes in a Mercurial patch (fortunately, I had a backup), and I would like to figure out what went wrong.
The Setup
I had a pair of patches, call them patch1.diff and patch2.diff. They were both based on revision 123, but affected completely different files, with no overlap. So, my repository looked something like this in TortoiseHg (where p is a patch and r is a regular revision):
Graph Rev Branch Tags Message
p 125 develop patch2.diff Change to existing file baz.php
p 124 develop patch1.diff Add new files foo.php and bar.php
r 123 develop Last committed changeset
|
r 122 develop Old changes
...
What I Did
I wanted to switch the order of the patches, because my work on patch2.diff was complete and I wanted to commit those changes. So I tried rebasing that patch onto revision 123. That didn't work, and I ended up with something like this:
Graph Rev Branch Tags Message
r Working directory - not a head revision!
r 126 develop Change to existing file baz.php
|
p | 125 develop patch2.diff Change to existing file baz.php
|
p | 124 develop patch1.diff Add new files foo.php and bar.php
|
r-+ 123 develop Last committed changeset
|
r 122 develop Old changes
...
That was clearly wrong. I now had a revision 126 with the same changes as those in patch2.diff, but I also still had a patch2.diff, which wasn't rebased as I expected. On top of that, I was getting the "not a head revision" message, even though there weren't actually any changes in my working directory.
So I stripped revision 126. At that point, things went completely off the rails, leaving me with this:
Graph Rev Branch Tags Message
p 125 develop patch2.diff Change to existing file baz.php
p 124 develop patch1.diff
r 123 develop Last committed changeset
|
r 122 develop Old changes
...
patch1.diff still appeared in TortoiseHg, but the changes and commit message were gone. I tried hg qpush --all, and got these messages:
applying patch1.diff
unable to read patch1.diff
I couldn't even find patch1.diff on my file system anymore. Ultimately, I had to run hg qdelete --keep patch1.diff and then restore my lost changes from offsite backups.
I ended up where I wanted to be, but nearly lost hours of work on a new feature. I was able to recover only because I had an offsite backup of the new files. That was terrifying.
The Question
What in the world happened? Why did I lose patch1.diff? I could understand if I lost the changes in patch2.diff given the way I used hg strip, but I have no idea why patch1.diff got nuked.

You have run into the reason why mq might very soon not be recommended anymore. It wants to retain control over the csets it manages, and it loses that control when you modify history that is under mq control. Thus mq does not work well with rebase, strip, histedit...
The better way is to simply stop using mq at all. Make secret (or draft) the default phase for new commits and commit your patches as normal changesets. Then mq cannot interfere with the proper working of rebase, and what you tried to do would simply have worked:
hg rebase -s125 -d123
hg rebase -s124 -d126
(given the state of your repo as in the first quote, and assuming r124 and r125 are normal csets, not under mq control)
And if you're a little daring, you take a look at the evolve extension which is very useful for people who maintain patch queues with respect to upstream repos or juggle draft changesets with collaborators.
See http://www.logilab.org/blogentry/88203 for an introduction to mercurial phases

Related

Using Mercurial, is it possible to see multiple commits in a PR (Pull Request)?

Using git and GitHub, I was able to work on one feature, using a branch, and commit and push, let's say 12 commits to a PR (Pull Request) on GitHub, and then reviewers can see the final diff with all 12 commits, or see each commit, so as to see how I made it barely working, and then each refinement one after another. (or maybe bug fix or refactoring).
However, at my company now, we are using Mercurial, and it seems I have to use hg amend to do it. But this way, the intermediate commits cannot be viewed independently.
If I commit one after another, the diff would show up as 2 or more Diffs, instead of one big diff for the PR. I cannot really ask team members to review my PR by going through 12 Diffs so I have to use hg amend and make it one Diff without the ability to see intermediate Diffs. Is this how Mercurial works, or is it just how the PR is set up for the Mercurial? That is, it really can work like how GitHub works too? (I think the diffs are shown by Phabricator but I am not entirely sure).
PRs (and MRs) are purely additions by GitHub and similar hosts. They did not exist as separate entities in pure Git, and Git clients have to support them only because of "git is GitHub", a common but completely erroneous notion today.
hg commit --amend, or hg amend from evolve, is not and cannot be the hg equivalent of a PR by concept: it squashes changes instead of adding external history.
In the pure Mercurial way, you can use hg in --bundle FILE FORK to fetch the changes from FORK-URL, inspect them either as a set of changesets or as one combined change with the relevant hg diff in a GUI or console, and then pull the bundle into the repo (or drop it) after review.

hg: fix branching without merges

I have a repository with wrong branching. Branches were used like tags for commit messages, showing the related part of the project (e.g. data, search, like features). No merging was used. Each commit, if it is about a different feature than the previous one, just reopens the branch with the desired name.
It looks like
o changeset: 717
| branch: default
|
o changeset: 523
| branch: search
|
o changeset: 357
| branch: data
|
o changeset: 397
| branch: data
|
o changeset: 789
| branch: default
What's the right way to stop that ugliness?
Update to each branch head and merge with the last commit, one after another? But there is actually nothing to merge.
Or update to each head, commit with "close branch", update to next head ... and at last update to default?
Firstly, as Aaron pointed out, and assuming these are misunderstandings by the committers, you will need to teach the others working on that repo when it's appropriate (by your definition of appropriate) to create named branches, and when it isn't. I imagine that would solve most of your problems.
If you want to bring everything back in line, and the branches are currently open, you can close each branch and then merge them into default. If you don't merge after closing, you will have a 'dangling' commit in your history graph, but that may or may not matter to you. Also, if you only have 1 'active' branch, which you can check with
hg branches
then all the commits are currently merged into that active branch, and you may only need to merge that branch into default to bring things in line. However you may still want to close the inactive branches, which will leave you with dangling commits if you don't merge them after closing.
If you want to avoid having commits with branchnames, you could look at using bookmarks, which are sort of an amalgamation of branching and tagging.
If you want to minimise branching ( even anonymous branching ) on your repo, you could use rebase, which is a built in extension that you have to enable. You can use the rebase switch when pulling someone else's changes, to automatically move your commits after the commits you are pulling in and make the history linear. You cannot do this once your commits have already been pulled or pushed elsewhere however ( well, you could, but it would make things very complicated depending on what happens with your pulled commits ). However using rebase would depend on how you feel about keeping accurate history.
Cleaning up the existing repo could be a bit of a mission. You could use rebase to move all the commits to a common branch, such as default. However, since that changes your history, the result will no longer be 'compatible' with the existing clones of that repo. So you would have to use rebase to alter your history, and then have everyone (including servers) reclone the altered repo and trash the old one (they cannot even pull any outstanding commits from the old one into the altered repo).
From the look of the history (just one line), people really just gave each commit a different branch name without actually ever doing any work on two different branches at the same time.
While I haven't seen this before, it's an interesting and clean way to sort changesets into features without creating an ugly history.
As for "stopping" this, just stop reopening branches. There is nothing that you need to do to "clean" the history since there aren't actually any heads which need merging. Therefore, Mercurial will tell you "nothing to merge" if you check out default and say hg merge search.

hg bundle not working

I'm trying to create a bundle for a remote team. They have a copy of the depot from revision 892 and we are currently on revision 1119.
First I tried patches, but that created a ton of files that botched up when trying to apply them (usually on the merge submits)... and our repository is 17GB in size, so I'm trying to create a delta patch, thus figured hg bundle was perfect for this.
I generated a bundle via:
>hg bundle --rev 1119 --base 892 depot-892-to-1119.bundle
This created a bundle file that is 350MB, which is acceptable and feels right.
But when we apply it to the destination depot that only goes to revision 892 it barfs on:
E:\dest-depot>hg unbundle -u depot-892-to-1119.bundle
adding changesets
transaction abort!
rollback completed
abort: 00changelog.i#e5cc33458251: unknown parent!
And so far this is similar to several other questions I have seen while searching, but I'll take it one step further.
I looked up e5cc33458251 in the source (bigger depot) and it shows up as revision 930 which is clearly after rev 892, but specifies this is the reason for the failure. Of course the destination depot doesn't have the revision. That is why I created the bundle in the first place.... so I'm not really sure why this one is causing me problems.
Now we do have a number of branches in the depot and rev 892 was tipped on a "Patch 2.7" branch and not default. I do not know if this should cause a problem. Eventually that patch branch was merged back into default in rev 999.
930 was actually a very small and trivial change to code and was also in "Patch 2.7" branch. There were actually 2 Patch 2.7 lines in the revision graph and they were merged together in 932. But again, nothing strange.
I am not seeing the problem here. Any ideas on what kind of a bundle I should be generating? Or if I should be going a different path?
It sounds like you're doing this essentially right, so let's check a few possible gotchas:
Are you aware that revision numbers aren't portable across clones? It's entirely possible that "their" 892 is different from yours. So you should find out what their latest revision is by nodeid and use that as the argument to --base.
I get that, with them being remote, using hg's internal protocol to actually transfer the data might not be feasible, but if you can get them to stand up an hg serve for a short while you can just do:
hg bundle ../depot-to-them.bundle http://THEIR_IP:8000
Then you'll have exactly the right bundle to get them everything they need without having to have them send you their nodeids.
Those aside, the only other bit of info that might be worth mentioning is that by using --rev X --base Y you're saying "I want to send all the ancestors of X that they don't have if they only have Y and its ancestors". So if there's a branch that's not yet merged into X, you're not going to be sending it, even if locally its revision numbers are between X and Y. That won't, however, prevent the bundle from being applied, so it's more of a good-to-understand than a possible cause of your troubles.

Mercurial: Fix a borked history

So working on a project recently (by myself - no other developers), I somehow managed to seriously bork the history with some (apparently) bad merges from cloned repositories.
What I would like to do - need to do - is fix this by just deleting the last 8 commits (according to hg glog)
Yes, I have made a few changes to the code after the borking began, however, only a few tweaks here or there - nothing I can't fix fresh from memory.
How can I get rid of the last 8 commits and start over from where I messed up?
Make a clone of your repository - when you do this, you can specify the last commit that should be cloned.
So, if your repository has 100 changesets and you want to get rid of changesets 93 to 100, just do this:
hg clone -r 92 BadRepository CleanRepository
--> the CleanRepository will only contain changesets 1 to 92.
If you use TortoiseHG, you can do the same in the Clone dialog (there is a textbox "Clone to revision:")

How does Mercurial work with many developers?

I look at Mercurial repositories of some known products, like TortoiseHg and Python, and even though I can see multiple people committing changes, the timeline always looks pretty clean, with just one branch moving forward.
However, let's say you have 14 people working on the same product, won't this quickly get into a branch nightmare with 14 parallel branches at any given time?
For instance, with just two people, and the product at changeset X, now both developers start working on separate features on monday morning, so both start with the same parent changeset.
When they commit, we now have two branches, and then with 14 people, we would quickly have 10+ (might not be 14...) branches that need to be merged back into the default.
Or... What am I not seeing here? Perhaps it's not really a problem?
Edit: I see there's some confusion as to what I'm really asking about here, so let me clarify.
I know full well that Mercurial easily handles multiple branches and merging, and as one answer states, even when people work on the same files, they don't often work on the same lines, and even then, a conflict is easily handled. I also know that if two people end up creating a merge hell because they changed a lot of the same code in the same files, there's some overall planning failure here, since we've given two features in the exact same place to two developers, instead of perhaps getting them to work together, or just giving both to one developer in the first place.
So that's not it.
What I'm curious about is how these open source projects manage such a clean history. It's not important to me (as one comment wondered) that the history is clean. I mean, we do work in parallel, and if the repository is able to reflect that, so much the better (in my opinion). However, the repositories I've looked at don't show that. They seem to be working along the Subversion model where you can't commit before you've updated and merged, in which case the history is just one straight line.
So how do they do it?
Are they "rebasing" the changes so that they appear to be following the latest tip of the branch even though they were originally committed a bit back in the branch history? Transplanting changesets to make them appear to have been committed in the main branch to begin with?
Or are the projects I've looked at either so slow (at the moment, I didn't look far back in the history) at adding new things that in reality they've only been working one person at a time?
Or are they pushing changes to one central maintainer who reviews and then integrates? It doesn't look like that since many of the projects I looked at had different names on the changesets.
Or... What am I not seeing here?
Perhaps it's not really a problem?
It's not really a problem. In a large project even when people work on the same feature, they don't usually work on the same file. When they work on the same file, they don't usually modify the same lines. And when they modify the same lines, then a merge should be done manually (for the affected lines).
This means in practice that 80+% of the merges can be done automagically by Mercurial itself.
Let's take an example:
you have:
[branch 1] [branch2]
\ /
\ /
[base]
Edit: for clarity, by branch I refer here to unnamed branches.
If you have a file changed in branch 1 but the same file in branch 2 is the same as in base, then the version in branch 1 is chosen. If the file is modified in both branch 1 and branch 2 the files are merged line by line using the same algorithm: if line 1 in file1 in branch 1 is different than line 1 in file1 in base but branch 2 and base have the line 1 equal, line 1 in branch 1 is chosen (and so on and so forth).
For the lines that are modified in both branches, Mercurial interrupts the automated merging process and prompts the user to choose which lines to use, or edit the lines manually.
Since deciding which lines to use is best done by the person(s) who modified those lines, a good practice is to have the person that implemented a feature perform the merge. That means that if me and you work on the same project, I implement my feature, then make a pull from a central/common repository (get the latest version that everyone uses), then merge my new version with the pulled changes, then publish it to the common repository (at this point, the common repository has one main branch, with my merged changes into it). Then, you pull that from the server and do the same with your changes.
This implies that everyone is capable of doing whatever they want in their local repository, and the common/official repository has one branch. It also means that you need to decide on a time frame when people should merge their changes in.
I used to have three or four repositories on my machine already compiled on different product versions (different branches of the repository) and a few different branches in my main repository (one for refactoring, one for development and so on). Whenever I would bring one branch to a stable state (say - finish a refactoring) I would pull from the server, merge that branch into the pulled changes, then push it back to the server and let anyone know that if they made any changes to the affected files, they should pull first from the server.
We used to synchronize implemented features every Monday morning and it took us about an hour to merge everything, then make a weekly build on the server to give to QA (on bad days it would take two members of the team two hours or so, then everyone would pull the week's changes on their machine and use them as a new base for the week). This was for an eight-developer team.
In your updated question it seems that you are more interested in ways of tidying up the history. When you have a history and want to make it into a single, neat, straight line you want to use rebase, transplant and/or Mercurial Queues. Check out the docs for those three and you should see the workflow for how it's done.
Edit: Since I'm waiting for a compile, here follows a specific example of what I mean:
> hg init
> echo test > a.txt
> hg addremove && hg commit -m "added a.txt"
> echo test > b.txt
> hg addremove && hg commit -m "added b.txt"
> hg update 0 # go back to initial revision
> echo test > c.txt
> hg addremove && hg commit -m "added c.txt"
Running hg glog now shows this (diverging) history with two branches:
# changeset: 2:c79893255a0f
| tag: tip
| parent: 0:7e1679006144
| user: mizipzor
| date: Mon Jul 05 12:20:37 2010 +0200
| summary: added c.txt
|
| o changeset: 1:74f6483b38f4
|/ user: mizipzor
| date: Mon Jul 05 12:20:07 2010 +0200
| summary: added b.txt
|
o changeset: 0:7e1679006144
user: mizipzor
date: Mon Jul 05 12:19:41 2010 +0200
summary: added a.txt
Do a rebase, making changeset 1 into a child of 2 rather than 0:
> hg rebase -s 1 -d 2
Now let's check the history again:
# changeset: 2:ea0c9a705a70
| tag: tip
| user: mizipzor
| date: Mon Jul 05 12:20:07 2010 +0200
| summary: added b.txt
|
o changeset: 1:c79893255a0f
| user: mizipzor
| date: Mon Jul 05 12:20:37 2010 +0200
| summary: added c.txt
|
o changeset: 0:7e1679006144
user: mizipzor
date: Mon Jul 05 12:19:41 2010 +0200
summary: added a.txt
Presto! Single line. :)
Also note that I didn't do a merge. When you rebase like this, you will have to deal with merge conflicts and everything just as if you did a merge, because that's pretty much what happens under the hood. Experiment with this in a small test repo. For example, try changing the file added in revision 0 rather than just adding more files.
I'm a Mercurial developer, so let me explain how we/I do it.
In the Mercurial project we accept contributions in the form of patches sent to the mailing list. When we apply those with hg import, we do an implicit rebase to the tip of the branch we are working on. This helps a lot with keeping the history clean.
As for my own changes, I use rebase or mq to linearize things before I push them, again to keep the history tidy. It's basically a matter of doing
hg push # abort: creates new remote head
hg pull
hg rebase
hg push
You can combine the pull and rebase if you like (hg pull --rebase) but I've always liked to take one step at a time.
By the way, there are some disagreements about this practice of linearizing the history -- some believe that the history should show how things really happened, with all the branches and merges and whatnot. I find that as long as you don't mess with public changesets, then it's okay and useful to linearize history.
The Linux kernel is stored in thousands of repositories and probably millions of branches, and this doesn't seem to pose a problem. For large projects you need a repository strategy (e.g., the dictator–lieutenants strategy), but having many branches is the main strength of the modern DVCSes and not a problem at all.
Yes, we'll have to merge and to avoid heads on the main repository, merging should be done on the child repositories by the developer.
So before you push your code to the parent repository, you first pull the latest changes, merge on your side and (try to) push. This should avoid unwanted heads in the master repo.
I don't know how the TortoiseHg team does things, but you can use Mercurial's rebase extension to "detach" a branch and drop it on the top of the tip, creating a single branch.
In practice, though, I don't get concerned about multiple branches, as long as I don't see more heads than there should be. Merging is not really a big deal.