split a hunk in hg; change tolerance

split a hunk in hg; change tolerance - mercurial

I use the hunk-by-hunk or hunk selection approach to committing: instead of commuting all changes I made to a file, I commit related parts. E.g. I wrote a function and a test, compiled to ensure it works and then commit the function and the test separately. For this I use built-in functionality in tortoiseHg and RecordExtention when in the console.
Now I have two edits separated by only one unchanged line, thus falling in hg's tolerance of one hunk. I want to commit only the former for now. How?

The record extension doesn't let you split hunks further, but the less-standard CRecord extension does.
Just to put it out there, but what you're doing is usually considered bad practice because it guarantees that you haven't run the unit tests on the files as they're being committed. That, of course, doesn't apply in all environments.
If the reason you're leaving some parts uncommitted is because they're local-only changes you always in in place (passwords, paths, etc.) they're a good candidate for a Mercurial Queues "patch". Then you'd be able to 'pop' them off, commit the whole file, and then 'push' them back on.

Related

Mercurial: Prevent commiting certain changes

I'm sure we've all done that thing where you temporarily hot-wire part of your application while you test something. We really don't want to commit such changes though.
Usually I mark such lines with a comment reminding me not to commit this change. But is there some way I can program Mercurial itself to refuse to commit any line containing a certain text fragment? (Not the entire file, just the marked line.) Is there some extension or something that does that?

The answer is a clear 'No, but...' (or 'Yes, but...' - depends on how you see it).
If you always indicate WIP lines in the same manner, I recommend to write and install a (local) commit hook which will fail, if any such WIP lines are detected in the changeset.
See https://www.mercurial-scm.org/wiki/Hook and https://www.mercurial-scm.org/wiki/UsefulHooks
In order to commit only some hunks, you can make use of the record extension (it's a default extension, just needs enabling). It allows you to cherry-pick the hunks at commit time. But it will fail at cherry-picking if the WIP code and the 'actual' code are in the same hunk.

Contrary to (good and correct in terms of sense) platetmaker's answer I'll suggest another way:
Enable, learn and use MQ Extension (see also Tutorial page) or, as lightweight alternative, Shelve Extension
Store your WIP (modifications in working directory) as MQ-patches or shelves locally until they aren't finished

Can I "break" an hg working copy in such a way that it must be reverted before commit?

I have a utility script that will configure the local developer instance to mimic a specific production one - copy over the correct connection strings, client files, etc - thereby making it easy to investigate certain issues.
Because this is all done locally this changes files in the working directory. The first thing the script does therefore is print out a big fat warning to caution developers not to commit and revert changes when they are done. However it is possible to forget or miss this warning as cruft from my build system scrolls by and I wonder if its possible to go further.
Is it possible to do something to the hg repo so that a commit will be rejected and the developer would have to revert first?
I realize there are some variables to the question as far as revert what and commit what but given that I'm not even sure that this is possible, I am willing to take close-with-caveats for an answer.

This is exactly what hooks are for. You would need a precommit hook on every repo or possibly the pretxnchangegroup.
By creating a precommit hook (script) that checks for a specific file or a specific change, you can fail the commit and print out whatever warning you need. The return value of the script indicates to mercurial if the transaction is valid or not.
Use the following generic sample to check for the presence of certain files that would be committed, before accepting or rejecting the changesets.

Mercurial: Automatically tagging a build

In a mercurial set up, I'd like to automatically tag certain builds based on continuous integration scripts. For example, a tag such as branchName-buildId whenever a build of a branch is deployed, or perhaps latest-stable whenever a build passes all integration tests.
However, I'm worried that the straightforward approach of simply calling hg tag will cause problems:
Some tags may be duplicate - i.e. latest-stable. I don't really care which build gets tagged in this situation, but I don't want any conflicts because a script can't resolve those.
Tags cause commits. However, this means that those commits need to be pushed and they need to be robust in the face of concurrent pushes by humans and other scripts. In particular, the automatic push can create additional heads, which is Not Good. But by the time the additional head is detected (at push) the local tag commit has already happened, and even though the new heads are likely trivially mergeable, sometimes tags cause conflicts.
How can I automatically let the CI server tag a build robustly? Here it's more important that the end result is consistent (i.e. that it doesn't mess up the CI server or the repo), and it's less important that tags are reliably applied in the face of duplicates or conflicts (which should be very unlikely anyhow).

I think you're right to be cautious. Robots aren't always the best citizens, and can often do silly things.
What you end up doing depends on what you see the tags being used for. For example, if you only see the CI system using them, then I'd suggest keeping them local. No pull/push/merge issues at all.
Some tags may be duplicate - i.e. latest-stable. I don't really care which build gets tagged in this situation, but I don't want any conflicts because a script can't resolve those.
If a tag is already defined, and you call hg tag again, it will fail unless you force it, but what this does is add a newer, later definition of the same tag, and the latest one wins. On one hand this is good, because the merge is simple, but think about the case when you do:
hg update -r latest-stable
hg update -r latest-stable
hg update -r latest-stable
hg update -r latest-stable
Each time you'll update to the version you'll get a version before the tag was made (as normal), and at that version latest-stable will point to the previous latest-stable. The result is that this sequence of commands will move you back through time.
Hence I'd say it's better either to have unique tags (i.e. stable-2013-02-18) or tag in two commits; One that removes the old tag, and one to add the new one.
hg update -r latest-stable # You're now at the commit that removed the tag.
hg update -r latest-stable # This one will error because tag doesn't exist
Tags cause commits. However, this means that those commits need to be pushed and they need to be robust in the face of concurrent pushes by humans and other scripts. In particular, the automatic push can create additional heads, which is Not Good. But by the time the additional head is detected (at push) the local tag commit has already happened, and even though the new heads are likely trivially mergeable, sometimes tags cause conflicts.
The CI robot should tag; pull; merge (if necessary); push. If the merge fails, don't push, raise an alarm. If the push fails (i.e. there's been more changesets in the time it took to merge), pull and merge again. I'd just make sure your script is very explicit about the revisions it's merging. This process should leave you with no extra heads.
I believe Mercurial treats the .hgtags file differently for merging because it knows about the content, so conflicts should be very rare. Also, tag commits are, in general, easy to merge because all that changes is .hgtags, so a merge from the CI head should never conflict. The only reason it could is because someone else is using the same tag names as the CI server, and if they are doing that then they need to have honey poured on their keyboard so they can do any more damage.
The situation I can see causing problems is if you're doing CI tagging on multiple heads with the same tag names. e.g. Development and release branches both have CI run on them, both have tests-clean tags assigned, but to different revisions, and are then merged later. Solution is, don't do that.
Hope some of that is helpful.

If you care about history of builds then consider creating a named branch just for the build process. In Mercurial all tags from all branches are visible in whole repository.
If you don't care about history bookmarks should do the trick. Build process can set bookmark latest-stable after tests are run and then execute hg push --bookmark latest-stable to push that bookmark to the server.
In either way take you have to take care that you don't run tests on revisions which child has already been tested. Mercurial revsets are very powerful query language and should help.

Why does Mercurial make you pull/update/merge for unrelated files?

For larger teams, having to pull/update/merge then commit each time makes no sense to me, specifically when the files that were changed by other developers have nothing to do with my changeset files.
i.e. I change file1.txt, and someone else changes file10.txt. Why must I merge on my computer before being allowed to push?
It makes pushing a big pain, as you have to constantly pull/update/merge if many developers are commiting.
Also, it makes your changeset look much larger than it was since it shows your merges as seperate commits.

Mercurial makes you do this since its atomic unit isn't a file but a changeset. That is a node containing a group of changes. Each changeset is an individual node in history and represents what that person did. This does result in you having to merge even if no common files where changes (which would be a simple automatic merge). These merge nodes are important since they are part of your repositories history and gives Mercurial more information for future merges with ancestral information.
That said there is an extension you can use that would clean up your history a bit (but won't resolve your issue with needing to pull before you push). It is called the rebase extension, it is shipped with Mercurial but disabled by default. It adds a new arumument to pull that looks like:
hg pull --rebase
This will pull new changes and moves your local changeset linearly above them without having a merge changset. However, I would urge against using this since you do lose information about your repository since you are re-writing its history. Read this post for information about some issues that this may cause.

Well, you could try using rebase, which will avoid the merge commits, but it is not without its own perils. You can also collapse to one step by doing "hg pull --update", rather than separate hg pull; hg update commands.
As for why you must merge on your computer: this is a direct consequence of mercurial being a distributed version control system. There is no central server which can be considered canonical (unless you create one by convention), so there is no other "place" where the merge could occur. You are the only one who can decide how the information in your repo should be combined with the information in the remote repo. The results of these decisions must be recorded, and that is the origin of the merge commit.
Also, in your example the merge would happen without user interaction since there are no conflicts (the same would be true with rebase), so I don't see why that is a problem.

Because having changes in disjunct files does not guarantee that they are independent.
When you pull in changes, even if they are in files that are untouched by your local changes, it can cause your local changes to stop working. E.g. an interface that you access from newly written code could have been changed.
This is why there is always a merge step inbetween, so that a human can review the changes, test for issues, and address them before integrating the changes back into the main repository. This step is very important, because skipping it risks blocking all those 50-100 colleagues (which is very expensive).
I would take Lasse’s advice and push less often. Merging isn’t a big deal if you only need to do it twice or thrice a day. Also maybe create smaller team repositories (or branches) that are merged with the main repository daily by a designated person.

A mercurial merge chose the wrong changes, what is the correct way to fix this?

Changes were made to our .vcproj to fix an issue on the build machine (changeset 1700). Later, a developer merged his changes (changes 1710 through 1715) into the trunk, but the mercurial auto-merge overwrote the changes from 1700. I assume this happened because he chose the wrong branch as the "parent" of the merge (see part 2 of the question).
1) What is the "correct" mercurial way to fix this issue, considering out of all the merged files, only one file was merged incorrectly, and
2) what should the developer have done differently in order to make sure this didn't occur? Are there ways we can enforce the "correct" way?
Edit: I probably wasn't clear enough on what happened. Developer A modified a line in our .vcproj file that removed an option for the compiler. His check-in became changeset 1700. Developer B, working from a previous parent (let's say changeset 1690), made some changes to completely different parts of the project, but he did touch the .vcproj file (just not anywhere near the changes made by Developer A). When Developer B merged his changes (becoming changes 1710 through 1715), the merge process overwrote the changes from 1700.
To fix this, I just re-modified the .vcproj file to include the change again, and checked it in. I just wanted to know why Mercurial thought that it shouldn't keep the changes in 1700, and whether or not there was an "official" way to fix this.
Edit the second: Developer B swears up and down that Mercurial merged the .vcproj file without prompting him for conflict resolution, but it is of course possible that he's just misremembering, in which case this whole exercise is academic.

I will address the 2nd part of you question first...
If there is a conflict, the automated merge tools should force the programmer to decide how the merge happens. But the general assumption is that a conflict will involve two edits to the same set of lines. If somehow a conflict arises because of edits to lines that are not close to each other the automated merge will blithely choose both of the edits and a bug will appear.
The general case of a merge tool always merging properly is very hard to solve, and really can't be with current technology. Here is an example of what I mean from C:
int i; // Someone replaces this with 'short i' in one changeset stating
// that a short is more efficient.
// ... lots of code;
// Someone else replaces all the 65000s with 100000s in another changeset,
// saying that more precision is needed.
for (i = 0; i < 65000; ++i) {
integral_approximation_piece(start + i/65000.0, end + (i + 1) / 65000.0);
}
No merge tool is going to catch this kind of conflict. The tool would have to actually compile the code to see that those two parts of the code have anything to do with eachother, and while that would likely be enough in this case, I can construct an example that would require the code to be run and the results examined to catch the conflict.
This means that what you really ought to do is rigorously test your code after a merge, just like you should after any other change. The vast majority of merges will result in obvious conflicts that a developer will have to resolve (even though that resolution is often fairly obvious), or will merge cleanly. But the very few merges that don't fit either category can't easily be handled in an automated fashion.
This can also be fixed by development practices that encourage locality. For example a coding standard that states "Variables should be declared near where they're used.".
I'm guessing that .vcproj files are particularly prone to this problem since they are not well understood by developers and so if conflicts do appear they will not be sure what to do with them. My guess is that this happened and your developer simply did a revert back to the revision (s)he checked in.
As for part 1...
What to do in this case depends a lot on your development process. You can either strip the merge changeset out and redo it, though that won't work very well if lots of people have already pulled it, and it will work especially poorly if there are lots of changesets that have already been checked in that are based on the merge changeset.
You can also check in a new change that fixes the problem with the merge.
Those are basically your two options.
The tone of your post seems to me to indicate that you may have some politics surrounding this issue in your organization, and people are blaming this error on the frequent merges of Mercurial. So I will point out that any change control system can have this problem. In the case of Subversion, for example, every time a developer does an update while they have outstanding changes in their working directory, they are doing a merge, and this kind of problem can arise with any merge.

In mercurial a merge doesn't have a single parent, it by definition has two and only two parents. When someone is merging they're making two choices:
What two changesets will constitute the two changes
Which of those changesets will be the left-parent and which will be the right-parent
Of those two questions the first is very important, and the second barely matters at all, though it took me a while to come to understand that.
You select the left-parent by using hg update X. That changes the output of hg parents (or in newer versions hg summary) and essentially determines what's in your working directory before the merge.
You select the right-parent by using hg merge Y. That says merge X (the working directory's parent) with changeset Y. As a special case, if there are only two heads in your repository and your parent is already one of them then Y will default to the the other.
I'd have to see your resulting graph to know just what the developer did, but it's possible he didn't update to one head or another before invoking merge, which would have him merging one head with some point back in history.
If your developer picked the right parents for the merge then the left vs. right doesn't much matter -- the only real difference is that when one uses hg diff or hg log -p or some other command that shows the patch for a merge changeset, it's displayed relative to the left-parent. That's, however, mostly a factor in display only. Functionally they're pretty much identical.
Assuming your developer picked the right changesets then what he should have done was test the result of the merge before committing it. Merging is software development, not an annoying VCS side effect, and not testing before committing is the error.
Fixing
To fix this, just re-do the merge correctly. Use hg update to set one parent, use hg merge to pick the other. Make sure your current working directory is correct and then commit. You can get rid of his bad merge using something like hg strip or better, just close down his branch with hg commit --close-branch after updating to it.
Avoiding
You say "mercurial auto-merge", but mercurial doesn't really auto-merge. It does a premerge which is an extremely cautious combination of obvious changes, but it's so careful it won't even merge for you if each merge parent adds code in the same region because it can't know which block of code you'd rather have first.
You can disable this premerge entirely or on a file-by-file basis using the merge tool configuration options:
https://www.mercurial-scm.org/wiki/MergeToolConfiguration?highlight=premerge

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008