Something like VSS labels in Mercurial?

What I would like to do is something like this:
When I deliver my code, I would like to label my part of the source with “delivered”.
If other developers follow the same convention, it should be possible to extract from the scm all the code labelled delivered.
When I deliver code again, it should be possible to move or replace the label delivered on my source.
What is the closest thing you can do in Mercurial, or what is the best convention to follow to keep track of code in specific states as described above?
(I haven’t actually done much of this in VSS, so I may be mistaken about how it works.)
Appendix 1:
I would like us to work in one branch as far as possible, and to commit and pull/push as often as possible. We then need something like labels to keep track of code in a certain state.

It sounds like you might want named branches. Each developer can work on their own branch and merge their branch to a "delivered" branch when ready. When all the developers have merged their branches, the release can be given a final check and tagged.

If I understand this right, you want to label individual files in more than one revision. E.g. in revision 1 you label all files in /lib1/ as "delivered", and in revision 2 you label all files in /src/ as "delivered". Now if somebody comes along and tells hg to give him all the code that is "delivered", you want in /lib1/ the files from revision 1, and in /src/ the files from revision 2.
If this is what you want, it is not possible in hg (and for a good reason: it is considered bad practice). In such a scenario, you could perhaps split the single repo into 2 subrepos "lib1" and "src", versioning both separately. You can then register a certain combination of these 2 sub-repositories by means of a commit in the super-repository, followed by a tag "delivered" in the super-repository.
If you do not want something like this, I don't understand the purpose of labeling only a subset of files in a revision. In this case standard tagging is sufficient, as you can move public tags in Mercurial, anyway (and with history!).
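To illustrate what moving a tag "with history" means, here is a minimal sketch. It assumes the usual .hgtags layout of one "<node> <tag name>" pair per line, where a later line for the same name overrides an earlier one. The node values and the helper name resolve_tags are made up for the example, and real Mercurial also reconciles .hgtags across heads, which this ignores.

```python
def resolve_tags(hgtags_text):
    """Return (current, history) for the tags listed in an .hgtags-style text.

    current maps each tag name to the node of its last entry;
    history maps each tag name to every node it has pointed at, in order.
    """
    current = {}
    history = {}
    for line in hgtags_text.splitlines():
        line = line.strip()
        if not line:
            continue
        node, name = line.split(" ", 1)
        current[name] = node  # a later entry overrides an earlier one
        history.setdefault(name, []).append(node)
    return current, history


# Hypothetical file contents: "delivered" was tagged once and later moved.
EXAMPLE_HGTAGS = "\n".join([
    "1" * 40 + " delivered",
    "2" * 40 + " release-1.0",
    "3" * 40 + " delivered",
])

current, history = resolve_tags(EXAMPLE_HGTAGS)
print(current["delivered"])       # the node the tag points at now
print(len(history["delivered"]))  # how many times the tag was set
```

The old position is not lost when the tag moves; it simply stops being the current one.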

The closest thing in Mercurial is a tag.

Related

How to keep mercurial graph "flat"

I have a "main" repository, I clone it and make some changes in the clone. In the meantime, there are other changes in "main", so I pull them and merge them in my clone. I make more changes in the clone, and merge any other new changes from "main". This gives me this graph:
When I finish my work in the clone, I push to the main repository and now the graph in "main" looks like this:
I know they're topologically the same, but to me the first one is clearer (this one is a very simple case, but things could get more complicated).
Is there any way to prevent this? I've found this question about reordering the graph after the fact, but I was thinking maybe there's a problem in my workflow or something I could change to prevent it.
The problem is that the graph is sorted by the revision number, not by the revision date. This is effectively sorting by the date/time that the revisions appeared in the current repository. There is an outstanding issue on the thg project to allow sorting of the list by revision date but one of the developers said that this change would need to involve hiding the graph as he thinks that the re-write of the grapher would be too complicated for too little gain (the issue is here).
There is no workflow involving merge that I know of to fix it because the revisions will never be in the same order on different repositories if work is carried out on more than one repo.
One way to neaten up the tree would be to use rebase instead of merge after pulling your changes. This would result in a single branch with no merges as it re-writes history to make it appear as though your draft revisions were implemented after the changes that you just pulled. If you want to read up on rebase, that info is here.

Given a file, how to find out which revision in a mercurial repository this is?

Assume that there is a file under hg version control. I have a particular version of that file, and I would like to find out in which revision this file was in this version.
I suspect that there are two possible ways to do this.
Do hg update in a loop and diff the file against subsequent versions (sloooow, but should work).
Make Mercurial put the rev number in a, say, comment in the second line of the file right before committing. From what I have read, a precommit hook might be of use. Then I don't have to compare anything, just look at the file itself (I'm assuming no-one will change this, of course, but this is rather safe assumption in my case).
My use case is a joint paper, written in LaTeX, with two coauthors who have no idea about version control at all, but I prefer to use it (for obvious reasons). We communicate by email, and there's effectively a human-based lock system ("I will not work on this file until you send me the next version, ok?"). The only problem that arises is that I'm sending version X to author B to proofread, then author C sends me a corrected version Y and I commit it into my repo, then author B sends his corrections Z (to version X) and I'm starting to get lost. However, I can check the attachment in the email sent to B, and I only need to find out which revision it is.
So, my question is: which of the two ideas above would be better, or maybe there's yet another one to help me deal with this mess?
hg archive is a good method for future work, but I can suggest at least three alternative work styles, plus one way to find the matching version for the file you have already sent.
Future work
You can use separate named branches for the co-authors and default for the merged results: always send each co-author the head of his branch, update that branch after you get his corrections (so you'll always know what you sent), and merge the branches into default.
Use one branch and mark the revision sent to each co-author with a bookmark, which you later move to the next point.
Mercurial keywords are considered something of a "feature of last resort", but in your case they are an obvious and usable solution: just add a keyword with the hash ID to the file (use the bundled keyword extension instead of a hook; it is easier and more reliable).
Current state
To find the changeset that the file came from, you can try bisect (example) with a test script that checks, for example, the CRC of the file (you have the CRC of the unversioned file, so check the versioned file against it across history).
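The checksum comparison could be sketched roughly as below. Note that this is a plain linear scan over the history rather than a bisect, since "matches the checksum" is not a property that flips only once along the history. The history is passed in as (revision, contents) pairs so the sketch runs on its own; in a real repository the contents would come from something like hg cat -r REV FILE. The function names and sample data are made up for the example.

```python
import zlib


def crc_of(data: bytes) -> int:
    """CRC-32 of the given bytes as an unsigned integer."""
    return zlib.crc32(data) & 0xFFFFFFFF


def revisions_matching(history, target: bytes):
    """Return the revisions whose file contents equal target.

    history is an iterable of (revision, contents) pairs. The CRC is a
    cheap first filter; the byte comparison rules out CRC collisions.
    """
    wanted = crc_of(target)
    return [
        rev
        for rev, contents in history
        if crc_of(contents) == wanted and contents == target
    ]


# Hypothetical history: the file did not change between revisions 1 and 2.
HISTORY = [
    (0, b"first draft\n"),
    (1, b"draft sent to B\n"),
    (2, b"draft sent to B\n"),
    (3, b"with corrections from C\n"),
]

print(revisions_matching(HISTORY, b"draft sent to B\n"))
```

If the file is unchanged across several revisions, all of them are reported, and the earliest one is usually the one of interest.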
If you're happy to rely on finding the emails you send the reviewers, why not just include the revision hashes in them along with the files?
You can get this for almost zero extra effort by generating your attachment using hg archive, which will create a file containing 1) your files for review, and 2) .hg_archival.txt, complete with revision hash.
Though I'd be surprised if there isn't a more elegant way, even if your collaborators are dead-set against using version control.
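If a reviewer sends the whole archive back, the revision can be read out of .hg_archival.txt with a few lines. This is only a sketch: it assumes the file consists of "key: value" lines that include a node field, and the sample text and the name parse_archival are made up for the example.

```python
def parse_archival(text):
    """Parse 'key: value' lines into a dict (a later key overrides an earlier one)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            fields[key.strip()] = value.strip()
    return fields


# Hypothetical contents of an .hg_archival.txt file.
SAMPLE = "\n".join([
    "repo: " + "a" * 40,
    "node: " + "b" * 40,
    "branch: default",
])

info = parse_archival(SAMPLE)
print(info["node"])  # the changeset the archive was generated from
```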

Using mercurial on divergent branches

What is a good workflow for using mercurial with two long-running branches that are slightly divergent (i.e. I never intend to entirely merge them back together)?
In my case, this is CMS software that has been customized differently for two different web sites. I started with projectA, and once that was working cloned it to projectB and made further tweaks to both A and B to customize them. Now I want to develop some features that show up in both A and B, without merging the site-specific customizations. How?
hg push will push everything, so that won't work
Transplant appears to give me different changeset hashes, which worries me
I feel like maybe the repositories should be set up differently, but I'm not sure how.
As Thilo comments, the common part would be best developed (and published in A and B) as a third repo declared as a SubRepo.
That way, you respect the first two repos which are independent (one evolution on A doesn't always mean an evolution on B), and you can develop the common part in subrepo C.
A solution for Mercurial might be if you can put the different areas in files that can be in .hgignore, but then they won't be versioned, so that may not be so good.
Another way is to use just one repo with a global flag, and use template A or B depending on the flag, and/or include different source files depending on the flag. If the difference is small, you can use if-then-else inside the same file.
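A rough sketch of the single-repo, global-flag idea follows. The flag values, the directory names and the SITE environment variable are all hypothetical; the CMS in question will have its own way of selecting templates.

```python
import os

# Hypothetical mapping from the site flag to its template directory.
TEMPLATES = {
    "A": "templates/site_a",
    "B": "templates/site_b",
}


def template_dir(site):
    """Return the template directory for the given site flag."""
    try:
        return TEMPLATES[site]
    except KeyError:
        raise ValueError(f"unknown site flag: {site!r}") from None


def current_site():
    """Read the global flag; assumed here to be an environment variable."""
    return os.environ.get("SITE", "A")


print(template_dir("B"))
```

Shared features then live in one codebase, and only the flag differs between the two deployments.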
You can use hg push to push the changes back together, but you don't necessarily have to merge all the changesets into the trunk. Just take the ones you want.
As stated above, a subrepo is probably the best option. Another alternative would be to have a third branch with the common work, and merge from that branch to projectA and projectB (but never back to the common branch).
This alternative is more likely to have accidents (merging the wrong way) but you might find that it is easier to set up and get working quickly.

A mercurial merge chose the wrong changes, what is the correct way to fix this?

Changes were made to our .vcproj to fix an issue on the build machine (changeset 1700). Later, a developer merged his changes (changes 1710 through 1715) into the trunk, but the mercurial auto-merge overwrote the changes from 1700. I assume this happened because he chose the wrong branch as the "parent" of the merge (see part 2 of the question).
1) What is the "correct" mercurial way to fix this issue, considering out of all the merged files, only one file was merged incorrectly, and
2) what should the developer have done differently in order to make sure this didn't occur? Are there ways we can enforce the "correct" way?
Edit: I probably wasn't clear enough on what happened. Developer A modified a line in our .vcproj file that removed an option for the compiler. His check-in became changeset 1700. Developer B, working from a previous parent (let's say changeset 1690), made some changes to completely different parts of the project, but he did touch the .vcproj file (just not anywhere near the changes made by Developer A). When Developer B merged his changes (becoming changes 1710 through 1715), the merge process overwrote the changes from 1700.
To fix this, I just re-modified the .vcproj file to include the change again, and checked it in. I just wanted to know why Mercurial thought that it shouldn't keep the changes in 1700, and whether or not there was an "official" way to fix this.
Edit the second: Developer B swears up and down that Mercurial merged the .vcproj file without prompting him for conflict resolution, but it is of course possible that he's just misremembering, in which case this whole exercise is academic.
I will address the 2nd part of your question first...
If there is a conflict, the automated merge tools should force the programmer to decide how the merge happens. But the general assumption is that a conflict will involve two edits to the same set of lines. If somehow a conflict arises because of edits to lines that are not close to each other the automated merge will blithely choose both of the edits and a bug will appear.
The general case of a merge tool always merging properly is very hard to solve, and really can't be with current technology. Here is an example of what I mean from C:
int i; // Someone replaces this with 'unsigned short i' in one changeset,
       // stating that a short is more efficient.
// ... lots of code;
// Someone else replaces all the 65000s with 100000s in another changeset,
// saying that more precision is needed.
for (i = 0; i < 65000; ++i) {
    integral_approximation_piece(start + i/65000.0, end + (i + 1) / 65000.0);
}
No merge tool is going to catch this kind of conflict. The tool would have to actually compile the code to see that those two parts of the code have anything to do with each other, and while that would likely be enough in this case, I can construct an example that would require the code to be run and the results examined to catch the conflict.
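To make this concrete, here is a toy line-by-line three-way merge. It is not Mercurial's algorithm, just a sketch of the general idea that an edit made on only one side is accepted without question. It assumes a 16-bit unsigned counter (maximum 65535), so that each change is safe on its own; the names merge3 and COUNTER_MAX are made up for the example.

```python
def merge3(base, ours, theirs):
    """Toy three-way merge for files whose lines were only edited in place."""
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged = []
    for b, o, t in zip(base, ours, theirs):
        if o == t:        # both sides agree
            merged.append(o)
        elif o == b:      # only their side changed this line
            merged.append(t)
        elif t == b:      # only our side changed this line
            merged.append(o)
        else:             # both sides changed the same line differently
            raise ValueError(f"conflict: {o!r} vs {t!r}")
    return merged


COUNTER_MAX = 2**16 - 1  # assumption: the narrowed counter is 16-bit unsigned

base = ["int i;", "for (i = 0; i < 65000; ++i) {"]
narrowed = ["unsigned short i;", "for (i = 0; i < 65000; ++i) {"]
more_steps = ["int i;", "for (i = 0; i < 100000; ++i) {"]

merged = merge3(base, narrowed, more_steps)
print(merged)                  # both edits are taken, no conflict is reported
print(65000 <= COUNTER_MAX)    # the narrowing change alone is fine
print(100000 <= COUNTER_MAX)   # after the merge the bound no longer fits
```

Each edit touches a different line, so the merge is "clean", yet the merged loop bound can never be reached by the narrowed counter.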
This means that what you really ought to do is rigorously test your code after a merge, just like you should after any other change. The vast majority of merges will result in obvious conflicts that a developer will have to resolve (even though that resolution is often fairly obvious), or will merge cleanly. But the very few merges that don't fit either category can't easily be handled in an automated fashion.
This can also be fixed by development practices that encourage locality. For example, a coding standard that states "Variables should be declared near where they're used."
I'm guessing that .vcproj files are particularly prone to this problem since they are not well understood by developers and so if conflicts do appear they will not be sure what to do with them. My guess is that this happened and your developer simply did a revert back to the revision (s)he checked in.
As for part 1...
What to do in this case depends a lot on your development process. You can strip the merge changeset out and redo it, though that won't work very well if lots of people have already pulled it, and it will work especially poorly if there are lots of changesets that have already been checked in that are based on the merge changeset.
You can also check in a new change that fixes the problem with the merge.
Those are basically your two options.
The tone of your post seems to me to indicate that you may have some politics surrounding this issue in your organization, and people are blaming this error on the frequent merges of Mercurial. So I will point out that any change control system can have this problem. In the case of Subversion, for example, every time a developer does an update while they have outstanding changes in their working directory, they are doing a merge, and this kind of problem can arise with any merge.
In mercurial a merge doesn't have a single parent, it by definition has two and only two parents. When someone is merging they're making two choices:
Which two changesets will be the two parents
Which of those changesets will be the left-parent and which will be the right-parent
Of those two questions the first is very important, and the second barely matters at all, though it took me a while to come to understand that.
You select the left-parent by using hg update X. That changes the output of hg parents (or in newer versions hg summary) and essentially determines what's in your working directory before the merge.
You select the right-parent by using hg merge Y. That says merge X (the working directory's parent) with changeset Y. As a special case, if there are only two heads in your repository and your parent is already one of them then Y will default to the other.
I'd have to see your resulting graph to know just what the developer did, but it's possible he didn't update to one head or another before invoking merge, which would have him merging one head with some point back in history.
If your developer picked the right parents for the merge then the left vs. right doesn't much matter -- the only real difference is that when one uses hg diff or hg log -p or some other command that shows the patch for a merge changeset, it's displayed relative to the left-parent. That's, however, mostly a factor in display only. Functionally they're pretty much identical.
Assuming your developer picked the right changesets then what he should have done was test the result of the merge before committing it. Merging is software development, not an annoying VCS side effect, and not testing before committing is the error.
Fixing
To fix this, just re-do the merge correctly. Use hg update to set one parent, use hg merge to pick the other. Make sure your current working directory is correct and then commit. You can get rid of his bad merge using something like hg strip or better, just close down his branch with hg commit --close-branch after updating to it.
Avoiding
You say "mercurial auto-merge", but mercurial doesn't really auto-merge. It does a premerge which is an extremely cautious combination of obvious changes, but it's so careful it won't even merge for you if each merge parent adds code in the same region because it can't know which block of code you'd rather have first.
You can disable this premerge entirely or on a file-by-file basis using the merge tool configuration options:
https://www.mercurial-scm.org/wiki/MergeToolConfiguration?highlight=premerge

best practices in mercurial: branch vs. clone, and partial merges?

...so I've gotten used to the simple stuff with Mercurial (add, commit, diff) and found out about the .hgignore file (yay!) and have gotten the hang of creating and switching between branches (branch, update -C).
I have two major questions though:
If I'm in branch "Branch1" and I want to pull in some but not all of the changes from branch "Branch2", how would I do that? Particularly if all the changes are in one subdirectory. (I guess I could just clone the whole repository, then use a directory-merge tool like Beyond Compare to pick&choose my edits. Seems like there ought to be a way to just isolate the changes in one file or one directory, though.)
Switching between branches with update -C seems so easy, I'm wondering why I would bother using clone. I can only think of a few reasons (see below) -- are there some other reasons I'm missing?
a. if I need to act on two versions/branches at once (e.g. do a performance-metric diff)
b. for a backup (clone the repository to a network drive in a physically different location)
c. to do the pick&choose merge like I've mentioned above.
I use clone for:
Short-lived local branches
Cloning to different development machines and servers
The former use is pretty rare for me - mainly when I'm trying an idea I might want to totally abandon. If I want to merge, I'll want to merge ALL the changes. This sort of branching is mainly for tracking different developers' branches so they don't disturb each other. Just to clarify this last point:
I keep working on my changes and pull my fellow devs' changes and they pull mine.
When it's convenient for me I'll merge ALL of the changes from one (or all) of these branches into mine.
For feature branches, or longer lived branches, I use named branches which are more comfortably shared between repositories without merging. It also "feels" better when you want to selectively merge.
Basically I look at it this way:
Named branches are for developing different branches or versions of the app
Clones are for managing different contributions to the same version of the app.
That's my take, though really it's a matter of policy.
For question 1, you need to be a little clearer about what you mean by "changes". Which of these do you mean:
"I want to pull some, but not all, of the changesets in a different branch into this one."
"I want to pull the latest version of some, but not all, of the files in a different branch into this one."
If you mean item 1, you should look into the Transplant extension, specifically the idea of cherrypicking a couple of changesets.
If you mean item 2, you would do the following:
Update to the branch you want to pull the changes into.
Use hg revert -r <branch you want to merge> --include <files to update> to change the contents of those files to the way they are on the other branch.
Use hg commit to commit those changes to the branch as a new changeset.
As for question 2, I never use repository clones for branching myself, so I don't know. I use named branches or anonymous branches (sometimes with bookmarks).
I have another option for you to look into: mercurial queues.
The idea is to have a stack of patches (not commits, "real" patches) on top of your current working directory. Then you can add or remove the applied patches: add one, remove it, add another one, etc. One single patch or a subset of them ends up being a new "feature", as you probably want to do with branches. After that, you can apply the patch as usual (since it is a change). Branches are probably more useful if you work with somebody else... ?