In Mercurial, how can a line disappear without a merge? - mercurial

Using mercurial, I've run into an odd problem where a line from one committer vanished at some point in the timeline and I can't explain why it is.
Log looks like this:
changeset: 172:xyz123
parent: 76:pqr345
user: barry baggings
date: Mon Jan 16 0:12:43 2012 +0000
summary: blah blah blah
changeset: 171:opq123
parent: 165:abc234
user: mary moggings
date: Mon Feb 01 1:12:41 2012 +0000
summary: naw naw naw
Running: hg diff -r 171 -r 172 gives this for abc.py (header omitted):
print "context line1"
- print "i need this line!"
print "context line2"
The mod in question print "i need this line! was definitely introduced in 171:opq123 but it's gone again in 172:xyz123,
But a diff between 76 and 172 shows no mods to abc.py! How can Barry leapfrog Mary's change like this?
Am I just misunderstanding how all this works? I have a pretty decent background in things like CVS and SVN but DVCSs make my head hurt sometimes... can someone explain?
I am slightly suspicious that it's because we're on mercurial 1.7.1 - could it be a bug?

These are separate heads, with 172 not based on 171. The graph, as produced if you have the graphlog extension enabled and run hg glog, or as you can see visually from hgweb if you run hg serve, open it in a browser and click on the "graph" link, shows it perhaps more clearly than the "parent" values of the changesets.
o changeset: 172:xyz123
| parent: 76:pqr345
| user: barry baggings
| date: Mon Jan 16 0:12:43 2012 +0000
| summary: blah blah blah
|
| o changeset: 171:opq123
| | parent: 165:abc234
| | user: mary moggings
| | date: Mon Feb 01 1:12:41 2012 +0000
| | summary: naw naw naw
| |
This shows clearly that changeset 172 is not based on changeset 171.Thus in comparing them, the change can have taken place elsewhere. You're not comparing an ancestor and descendant, you're comparing cousins of some degree.
So, while you introduced it in 171, because 172 is not based on 171, it doesn't get the addition of the line. A merge is in fact what you need to get the change, to make a changeset which is based both on 171 and 172 (and their ancestors, until they converge).
You may well have more heads that you want merged in; you can use hg heads to look at them. hg serve can also be very helpful for comprehending such things.

Related

How do I merge two heads on Mercurial?

Please, help me! I'm not familiar with Mercurial.
I have a message in my bitbucket telling me that my "default branch has multiple [two] heads", which is not what I want. The branches are both labeled "Default." I've tried hg merge and I get "abort: nothing to merge."
When I do hg heads, only one gets listed.
changeset: 4:fb6f0d961015
tag: tip
user: Name <my_email>
date: Fri Feb 07 03:39:23 2014 -0800
summary: Folder
Have you pulled changes from the server? It sounds like you have one head locally, but when your changes are combined with what is on the server it produces two heads...
If your changes 'C' are based on change 'B', but someone else has also made change 'D' based on 'B' ...
C D
| |
B /
|
A
You see A-B-C on your machine (only one head), but when it is pushed to the server it would create two.
If that is the case, you need to pull the latest changes from the server and then do the merge.
If you just performed a hg pull you probably just need to do
hg update -r <revision>
where is the one that resulted from pulling. In general you may find it convinient to combine the two operation with
hg pull -u <repository>

Mercurial, pulling a single branch

I have two Mercurial repos C1 and C2 which both derived from the same parent P some time ago but have since had separate lines of development. In addition, in C2, there is a named branch B2 which happened since the diversion. I want to pull only branch B2 into C1, which I can easily do with hg pull C2 --branch B2.
Now B2 branches off of some points in the default branch of the C2 repo. So those default changesets from C2 get pulled over into C1 even though I am only trying to pull branch B2. (I can understand that since they are ancestors of the B2 changesets).
After the above pull, I will have two heads on the default branch of C1, the original head and the head composed of those default changesets that got pulled over as a result of pulling B2 over. I want to leave the default branch of C1 unchanged, otherwise I have two heads, keep getting told that updates "cross branches" and telling me I have to merge. (I will be pulling new default branch things into C1 from other external repos going forward).
How can I do the above so that I do not have two heads on default?
Thinking about the problem, am I right in thinking you have something like this?
# 4[tip]:1 439255c536ee 2013-01-25 10:42 +0000 rob
| More changes on original default branch
|
| o 3 379f384c1d73 2013-01-25 10:41 +0000 rob
| | Changes on named branch B2
| /
| o 2:0 d225da266931 2013-01-25 10:40 +0000 rob
| | Changes on cloned default
| |
o | 1 7088660d3ba6 2013-01-25 10:41 +0000 rob
|/ Changes on original default branch
|
o 0 a02a921256b3 2013-01-25 10:39 +0000 rob
Project Start
The problem you're describing is that while you want to keep the B2 branch, you don't want the two heads on default:
$ hg heads default --style=compact
4[tip]:1 439255c536ee 2013-01-25 10:42 +0000 rob
More changes on original default branch
2:0 d225da266931 2013-01-25 10:40 +0000 rob
Changes on cloned default
I can see two possibilities to "fix" this. One is simply to close the "cloned" default branch (at least in your repo):
$ hg update 2
$ hg commit -m "Closing cloned default" --close-branch
$ hg heads default --style=compact
4[tip]:1 439255c536ee 2013-01-25 10:42 +0000 rob
More changes on original default branch
This would mean that if you issue an hg update default, it shouldn't be ambiguous, which I guess is part of the problem.
An alternative would be to do a "null merge" from the other default branch:
$ hg update
$ hg -y merge --tool=internal:fail 2
$ hg revert --all --rev .
$ hg resolve -a -m
$ hg commit -m "Merged cloned default"
The main difference in these approaches is that using the --close-branch option, you can still see the extra head when you use hg heads -c... it is still a head, but it simply has a flag set in the metadata to say it's closed. You can still update to the closed branch, and even commit changes to it. Doing a merge you won't see the head at all, as it will not longer be a head.
Both of these methods mean that you can still pull changes to the B2 branch - however, if in the repo you're pulling from they make changes on default and merge those into B2 (often used when bugfixing, etc), you will hit the same problem later on, and have to repeat the above.
I hope that makes sense. Of course, if you later push to another repo, you will push the closed/merged changesets too. It would be wise to clone your repo and try these approaches locally to check if that's what you actually want.

What tools or techniques are available to "datamine" my mercurial repository?

We have a 2,000,000 lines of code application in Mercurial. Obviously there is a lot of valuable information inside this repository.
Are there any tools or techniques to dig out some of that information?
For instance, over the history of the project, what five files have seen the most changes? What five files are the most different from what they were one year ago? Any particular lines of code seen a lot of churn?
I'm interested in that sort of thing and more.
Is there a way to extract this kind of information from our repository?
I don't know of any tools specifically made for doing this, but Mercurial's log templates are very powerful for getting data out of the system. I've done a bit of this sort of analysis in the past, and my approach was:
Use hg log to dump commits to some convenient format (xml in my case)
Write a script to import the xml into something queryable (database, or just work from the XML directly if it's not too big)
Here's an example hg log command to get you going:
mystyle.txt: (template)
changeset = '<changeset>\n<user>{author|user}</user>\n<date>{date|rfc3339date|escape}</date>\n<files>\n{file_mods}{file_adds}{file_dels}</files>\n<rev>{node}</rev>\n<desc>{desc|strip|escape}</desc>\n<branch>{branches}</branch><diffstat>{diffstat}</diffstat></changeset>\n\n'
file_mod = '<file action="modified">{file_mod|escape}</file>\n'
file_add = '<file action="added">{file_add|escape}</file>\n'
file_del = '<file action="deleted">{file_del|escape}</file>\n'
Example invocation using template and date range:
hg --repository /path/to/repo log -d "2012-01-01 to 2012-06-01" --no-merges --style mystyle.txt
Try the built-in hg churn extension. One thing I like to use it for, for example, is to see a monthly bar graph of commits like this:
> hg churn -csf '%Y-%m'
2014-02 65 *************************************
2014-03 22 *************
2014-04 52 ******************************
2014-05 67 ***************************************
2014-06 31 ******************
2014-07 29 *****************
2014-08 29 *****************
2014-09 61 ***********************************
2014-10 36 *********************
2014-11 23 *************
2014-12 32 ******************
2015-01 60 ***********************************
2015-02 20 ************
(might want to set up aliases if you find you're using the command often enough)

Mercurial: how to make Repository1 track Repository0 but pushing selective only certain tagged revisions/changesets

Using Mercurial. Working in private clones, making changes, pushing to master. Etc. That's fine.
(By the way, let me state that I am familiar with many version control systems ranging from the classics SCCS, RCS, CVS, SVN to DVCSs like BitKeeper, Git, Mercurial, passingly familiar with Monotone, Darcs, Bzr. Not so familiar with Perforce, ... I just mention this for people who may be able to explain equivalents and similarities between systems.)
Unfortunately, other members of the project think that I check in too often. I use the "check in early and often" approach, sometimes checking in more often than once every half hour. They don't want to see so many checkin messages in the log.
NOT QUESTION: Now, I have taken to the habit of making most of my changes on branches (although hg lacks retroactive branching), summarizing all of the changes when I merge my task branches back to the trunk. hg log -b default allows them to filter out my task branches, and works okay when the merge messages are meaningful. This is not my question.
NOT QUESTION: similarly, I know how to use history editing, etc., to remove my fine granularity changesets, so that the only changesets that get pushed are coarse grain. (Although, by the way, I often find the simplest thing to do is move .hg directories around.) History editing ias annoying, but I can do it.
MY QUESTION: my question is: I would like to keep track of my fine granularity changesets and log messages, even though I don't push them to the project master repository.
Q: how do I do this?
I.e. how do I keep two repositories tracking each other, but NOT push all of my changes to the master? Preferably tag changesrts not to be pushed. (By the way, I prefer to say "tag a revision set". Changeset sounds too much like patch.)
The way I do it now is to have
a) the project master repo
b) my personal master repo, with all of the fine grain changes
c) one or more workspaces
Where
a) clone the project master repo to get the workspace (using hg clone, yyeah, I know)
b) work in the workspace clone with fine grain checkins (I know this)
c) push from the workspace clone to my personal master
d) and, when it comes time to push to the master, edit the history, and push that.
Two troubles:
1) conflicts merging or pushing from the workspace clone to my personal master
- more precisely, conflicts when merging from the master repo with history edited to remove the fine grain checkins, to my personal master with the fine grain checkins preserved.
2) I would like to record what has been pushed to the project master in my personal master.
--
A very simple example.
A task branch, with fine grain checkins.
o changeset: 1022:
| branch: default
| parent: 1017
| parent: 1021
| summary: Merged task branch onto main, default, branch, with changes to FOO and BAR
|
o changeset: 1021
|\ branch: task-branch
| | parent: 1020
| | summary: Merged default branch to task-branch, with changes to FOO and BAR
| |
| o changeset: 1020
| | branch: task-branch
| | parent: 1018:
| | summary: yet another fine grain checkin on a branch changing BAR for a second time
| |
o | changeset: 1019
| | branch: task-branch
| | parent: 1018:
| | summary: yet another fine grain checkin on a branch changing FOO
| |
o | changeset: 1018
| | branch: task-branch
| | parent: 1017
| | summary: yet another fine grain checkin on a branch changing BAR a first time
| |
|/
o changeset: 1017
| branch: default
| summary: some other changeset on the default branch, i.e. trunk
|
The edited history, that gets pushed to the project master
o changeset: 1018: ...changed checksumm because checksum seems to include history
| branch: default
| parent: 1017
| summary: Merged task-branch onto main, default, branch, with changes to FOO and BAR.
| Removed task-branch-history from what got pushed to project repo,
| although may still be in personal repo
|
o changeset: 1017
| branch: default
| summary: some other changeset on the default branch, i.e. trunk
|
The problem is, when I try to merge an updated projecvt master, that may look like
o changeset: 1019
| branch: default
| parent: 1018
| summary: again, some other changeset on the default branch, i.e. trunk
|
o changeset: 1018: ...changed checksumm because checksum seems to include history
| branch: default
| parent: 1017
| summary: Merged task-branch onto main, default, branch, with changes to FOO and BAR.
| Removed task-branch-history from what got pushed to project repo, ]
| although may still be in personal repo
|
o changeset: 1017
| branch: default
| summary: some other changeset on the default branch, i.e. trunk
|
I get conflicts, even though changeset 1018 of the project master and 1022 in the personal master are exactly the same, in terms of file contents.
I've tried not changing the comments in the merge changesrt that gets pushed. Doesn't help, at least not reliably.
I wonder if the full history is included in the changesets.
I also wonder if there is some merge tool that can recognize same files, even though different history and log messages.
In general any workflow that involves editing history is not going to work well. It's just not the intended work mode, and no effort goes into making it work well.
Better options for you might be:
Versioned Mercurial Queues (hg qinit --create-repo) which lets you make multiple commits iterating on a single patch. You get the full history in your private patch repo and they only get the qfinished whole.
Rather than merge back your task branch and edit history, just re-apply the change as a single commit and push that. In your example it would be hg diff -r 1017 -r 1021 | hg import, which creates a new commit (1022) that is the sum of all the changes 1018::1021, inclusive. that should merge cleanly and be pushable without ever having to push any changesets on your task-branch out to them. You can even hg commit --close-branch on the task branch
Tell your coworkers to STFU. ;) High changeset granularity is good practice, and they should adapt.
Also, consider using bookmarks instead of named branches for your task-branches. Similarly the new phases feature makes it possible to mark changesets in your task branches (be they bookmark-branches or old-style named-branches) as secret so they can't be accidentally pushed.

Mercurial: applying changes one by one to resolve merging issues

I recently tried to merge a series of changeset and encountered a huge number of merging issues. Hence I'd like to to try to apply each changeset, in order, one by one, in order to make the merging issues easier to manage.
I'll give an example with 4 problematic changesets (514,515,516 and 517) [in my real case, I've got a bit more than that]
o changeset: 517
|
o changeset: 516
|
o changeset: 515
|
o changeset: 514
|
|
| # changeset: 513
| |
| o changeset: 512
| |
| o
| |
| o
| |
| o
|/
|
|
o changeset 508
Note that I've got clones of my repos before pulling the problematic changesets.
When I pull the 4 changesets and try a merge, things are too complicated to resolve.
So I wanted to pull only changeset 514, then merge. Then once I solve the merging issue, pull only changeset 515 and apply it, etc. (I know the numbering shall change, this is not my problem here).
How am I supposed to do that, preferably without using any extension? (because I'd like to understand Mercurial and what I'm doing better).
Is the way to go generate a patch between 508 and 514 and apply that patch? (if so, how would I generate that patch)
Answers including concrete command-line example(s) most welcome :)
I haven't tested this, but merging individual changesets should be easy enough:
$ hg update -r 513
$ hg merge -r 514
... # do your conflict resolution and commit
$ hg merge -r 515
... # repeat
I haven't tested either, but it should work to just hg up to each of the foreign changesets one after the other. I don't think you have to commit between the updates.
As a bonus, the command line example you wished :-)
hg up 513
hg up 514
hg up 515
hg up 516
hg up 517