Specific Commit Messages for Various Files in Mercurial

In my Mercurial repository, I have modified 4 files.
When I commit, the commit message applies to all of the files I modified. Is there any way to write a specific commit message for each modified file in the commit screen?

Ideally, every changeset should contain some specific change, be it something as small as fixing a small bug by correcting a single line in a file, or something as big as changing the signature of a function all over the codebase.
This allows you to do things like transplanting a changeset to a different branch later, which is only easy when your changesets are not polluted by unrelated stuff.
(By the way, this is the main difference between modern DVCS like Mercurial or Git, which track changes, and older systems like SVN, which track revisions.)
If you feel like you need to write separate comments for every file, this might mean you're actually committing several unrelated changes at once, which isn't good practice.
On the other hand, if it's not the case, you can of course write a multi-line message:
User can no longer add the same product twice (issue123)
add_product.py: added server-side validation
scripts.js: added client-side validation
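
From the command line, one way to enter such a multi-line message is to write it to a file and pass that file with hg commit -l (--logfile). A minimal sketch; the function name commit_multiline and the temporary file are illustrative helpers, not part of Mercurial:

```shell
# Commit with a multi-line message read from a temporary file.
# Usage: commit_multiline "summary line" "detail line" ["detail line" ...]
commit_multiline() {
    msgfile=$(mktemp) || return 1
    # One argument per line of the commit message.
    printf '%s\n' "$@" > "$msgfile"
    hg commit -l "$msgfile"
    rc=$?
    rm -f "$msgfile"
    return "$rc"
}
```

For the example above this would be called as commit_multiline "User can no longer add the same product twice (issue123)" "add_product.py: added server-side validation" "scripts.js: added client-side validation".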

If you are using the command line, you can commit changes to individual files by passing them on the hg commit command line:
> hg st
M file1.cpp
M file1.h
M file2.cpp
M file2.h
> hg commit -m "Some changes" file1.cpp file1.h
> hg st
M file2.cpp
M file2.h
You can do the same thing using TortoiseHg by only "checking" the files you want checked in in the thg commit window - or more precisely, by unchecking those you don't want committed.
Note however that this will create several changesets, one for each commit. If this is not what you want, then I would agree with Helgi.
In a previous version of TortoiseHg (back when it was written using tk, so not that long ago), it was actually possible to select individual "chunks" of changes to a file and commit them separately. However this is not available in the current version, and as far as I am aware is not planned for a while.

Actually, the commit message applies to the changeset, which contains changes to the files you modified.
You could make multi-line messages if you want to address the changes to each file:
new features, fixed bugs, etc.
file1.txt: fixed bug 1234
file2.txt: refactored body of Foobar()
file3.txt: did Rot13 on the entire file, twice
file4.txt: added overload of Bar() to accept a second Foo object
I don't recommend doing a changeset for each file, however, even though that is possible.

Related

Mercurial: Automatically tagging a build

In a mercurial set up, I'd like to automatically tag certain builds based on continuous integration scripts. For example, a tag such as branchName-buildId whenever a build of a branch is deployed, or perhaps latest-stable whenever a build passes all integration tests.
However, I'm worried that the straightforward approach of simply calling hg tag will cause problems:
Some tags may be duplicate - i.e. latest-stable. I don't really care which build gets tagged in this situation, but I don't want any conflicts because a script can't resolve those.
Tags cause commits. However, this means that those commits need to be pushed and they need to be robust in the face of concurrent pushes by humans and other scripts. In particular, the automatic push can create additional heads, which is Not Good. But by the time the additional head is detected (at push) the local tag commit has already happened, and even though the new heads are likely trivially mergeable, sometimes tags cause conflicts.
How can I automatically let the CI server tag a build robustly? Here it's more important that the end result is consistent (i.e. that it doesn't mess up the CI server or the repo), and it's less important that tags are reliably applied in the face of duplicates or conflicts (which should be very unlikely anyhow).

I think you're right to be cautious. Robots aren't always the best citizens, and can often do silly things.
What you end up doing depends on what you see the tags being used for. For example, if you only see the CI system using them, then I'd suggest keeping them local. No pull/push/merge issues at all.
"Some tags may be duplicate - i.e. latest-stable. I don't really care which build gets tagged in this situation, but I don't want any conflicts because a script can't resolve those."
If a tag is already defined, and you call hg tag again, it will fail unless you force it, but what this does is add a newer, later definition of the same tag, and the latest one wins. On one hand this is good, because the merge is simple, but think about the case when you do:
hg update -r latest-stable
hg update -r latest-stable
hg update -r latest-stable
hg update -r latest-stable
Each time you update, you get the revision the tag points at, which was committed before the tag itself was made (as normal), and at that revision latest-stable still points to the previous latest-stable. The result is that this sequence of commands moves you back through time.
Hence I'd say it's better either to have unique tags (e.g. stable-2013-02-18) or to tag in two commits: one that removes the old tag, and one that adds the new one.
hg update -r latest-stable # You're now at the commit that removed the tag.
hg update -r latest-stable # This one will error because tag doesn't exist
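
The two-commit variant could be scripted roughly as follows. This is only a sketch: move_tag is a made-up helper name, and it assumes the working directory is at a branch head, which hg tag requires unless forced.

```shell
# Move a floating tag in two commits: one removing the old definition,
# one adding the new one. Usage: move_tag TAG REV
move_tag() {
    tag_name=$1
    target_rev=$2
    # Remove the old definition only if the tag currently exists.
    if hg tags -q | grep -qxF -- "$tag_name"; then
        hg tag --remove "$tag_name" || return 1
    fi
    hg tag -r "$target_rev" "$tag_name"
}
```

A CI job would call it as, for example, move_tag latest-stable "$BUILD_REV", where BUILD_REV is whatever variable your CI system uses for the tested revision.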
"Tags cause commits. However, this means that those commits need to be pushed and they need to be robust in the face of concurrent pushes by humans and other scripts. In particular, the automatic push can create additional heads, which is Not Good. But by the time the additional head is detected (at push) the local tag commit has already happened, and even though the new heads are likely trivially mergeable, sometimes tags cause conflicts."
The CI robot should tag; pull; merge (if necessary); push. If the merge fails, don't push, raise an alarm. If the push fails (i.e. there's been more changesets in the time it took to merge), pull and merge again. I'd just make sure your script is very explicit about the revisions it's merging. This process should leave you with no extra heads.
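
As a rough sketch, the tag; pull; merge; push loop might look like the function below. The function name, the retry limit, and the alarm messages are assumptions for illustration. The merge uses --tool internal:merge so that no interactive merge tool is started, and the other head is looked up with a revset so the script is explicit about what it merges.

```shell
# Sketch of a CI tagging step: tag, then pull/merge/push with a bounded retry.
# Usage: ci_tag_and_push REV TAG [MAX_ATTEMPTS]
ci_tag_and_push() {
    ci_rev=$1
    ci_tag=$2
    ci_max=${3:-3}

    hg tag -r "$ci_rev" "$ci_tag" || return 1

    ci_try=0
    while [ "$ci_try" -lt "$ci_max" ]; do
        hg pull || return 1
        # Pick one other head of the current branch, if there is any.
        ci_other=$(hg log -r "head() and branch(.) and not ." --template "{node}\n" | head -n 1)
        if [ -n "$ci_other" ]; then
            if ! hg merge --tool internal:merge -r "$ci_other"; then
                echo "ALARM: merging $ci_other failed; not pushing" >&2
                return 1
            fi
            hg commit -m "Merge after tagging $ci_tag" || return 1
        fi
        # A failed push usually means new changesets arrived; pull and merge again.
        hg push && return 0
        ci_try=$((ci_try + 1))
    done
    echo "ALARM: giving up after $ci_max push attempts" >&2
    return 1
}
```

The sketch leaves the working directory as it is after a failed merge, so a real job would also need a cleanup step before the next build.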
I believe Mercurial treats the .hgtags file differently for merging because it knows about the content, so conflicts should be very rare. Also, tag commits are, in general, easy to merge because all that changes is .hgtags, so a merge from the CI head should never conflict. The only reason it could is because someone else is using the same tag names as the CI server, and if they are doing that then they need to have honey poured on their keyboard so they can't do any more damage.
The situation I can see causing problems is if you're doing CI tagging on multiple heads with the same tag names. e.g. Development and release branches both have CI run on them, both have tests-clean tags assigned, but to different revisions, and are then merged later. Solution is, don't do that.
Hope some of that is helpful.

If you care about the history of builds, then consider creating a named branch just for the build process. In Mercurial, all tags from all branches are visible in the whole repository.
If you don't care about history, bookmarks should do the trick. The build process can set the bookmark latest-stable after the tests are run and then execute hg push --bookmark latest-stable to push that bookmark to the server.
Either way, you have to take care not to run tests on revisions whose child has already been tested. Mercurial revsets are a very powerful query language and should help.
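
The bookmark variant is short enough to sketch in a few lines. The function name is made up, and the sketch assumes the bookmarks feature is available on both the CI machine and the server:

```shell
# Move the latest-stable bookmark to a tested revision and publish it.
# Usage: publish_stable_bookmark REV
publish_stable_bookmark() {
    stable_rev=$1
    # -f lets the bookmark move even though it already exists.
    hg bookmark -f -r "$stable_rev" latest-stable || return 1
    hg push --bookmark latest-stable
}
```

Because moving a bookmark does not create a changeset, this avoids the extra heads that tag commits can produce.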

What to do instead of squashing commits in Mercurial

I've got my IDE set to commit locally every time I save anything. I'd ideally like to keep an uncensored record of my idiot fumblings for the rare occasions they may be useful. But most of the time it makes my history way too detailed.
I'd like to know a good strategy to keep that history but be able to ignore it most of the time. My IDE is running my own script every time I save, so I have control over that.
I'm pretty new to Mercurial, so a basic answer might be all I need here. But what are all the steps I should do when committing, merging, and reporting to be able to mostly ignore these automatic commits, but without actually squashing them? Or am I better off giving up and just squashing?
Related question about how to squash with highly rated comment suggesting it might be better to keep that history
Edit - My point here is that if Mercurial wants to keep all your history (which I agree with), it should let you filter that history to avoid seeing the stuff you might be tempted to squash. I would prefer not to squash, I'm just asking for help in a strategy to (in regular usage, though not quite always) make it look as much as possible like I did squash my history.

You want to keep a detailed history in your repo, but you want to have (and be able to export) an idealized history that only contains "reasonable" changesets, right? I can sympathize.
Solution 1: Use tags to mark interesting points in the history, and learn to ignore all the messy bits between them.
Solution 2: Use two branches and merge. Do your development in branch default, and keep a parallel branch release. (You could call it clean, but in effect you are managing releases). Whenever default is in a stable state that you want to checkpoint, switch to branch release and merge into it the current state of default-- in batches, if you wish. If you never commit anything directly to release, there will never be a merge conflict.
(original branch) --o--o--o--o--o--o--o      (default)
                     \           \        \
                      r ... ... --r--------r (release)
Result: You can update to any revision of release and expect a functioning state. You can run hg log -b release and you will only see the chosen checkpoints. You can examine the full log to see how everything happened. Drawbacks: Because the release branch depends on default, you can't push it to another repo without bringing default with it. Also, hg glog -b release will look weird because of the repeated merges.
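
A checkpoint under solution 2 could be scripted along these lines. This is a sketch that assumes the release branch already exists and the working directory has no uncommitted changes; checkpoint_release is a hypothetical helper name.

```shell
# Checkpoint the current state of default on the release branch.
# Usage: checkpoint_release "commit message"
checkpoint_release() {
    checkpoint_msg=$1
    hg update release || return 1
    hg merge default || return 1
    hg commit -m "$checkpoint_msg" || return 1
    # Go back to day-to-day development.
    hg update default
}
```

Since nothing is ever committed directly to release, the merge step only brings over what happened on default since the last checkpoint.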
Solution 3: Use named branches as above, but use the rebase extension instead of merging. It has an option to copy, rather than move outright, the rebased changesets; and it has an option --collapse that will convert a set of revisions into a single one. Whenever you have a set of revisions r1:tip you want to finalize, copy them from default to release as follows:
hg rebase --source r1 --dest release --keep --collapse
This puts ONE revision at the head of release that is equivalent to the entire set of changesets from r1 to the head of default. The --keep option makes it a copy, not a destructive rewrite. The advantage is that the release branch looks just as you wanted: nice and clean, and you can push it without dragging the default branch with it. The disadvantage is that you cannot relate its stages to the revisions in default, so I'd recommend solution 2 unless you really have to hide the intermediate revisions. (Also: it's not as easy to squash your history in multiple batches, since rebase will move/copy all descendants of the "source" revision.)
All of these require you to do some extra work. This is inevitable, since Mercurial has no way of knowing which changesets you'd like to squash.

"it should let you filter that history to avoid seeing the stuff you might be tempted to squash"
Mercurial has the tools for this. If you just don't want to see them (in hg log, I suppose), filter these changesets out with revsets:
hg log -r "not desc('autosave')"
Or if you use TortoiseHg, just go View -> Filter Toolbar, and type in "not desc('autosave')" in the toolbar. Voila, your autosave entries are hidden from the main list.

If you actually do want to keep all the tiny changes from every Ctrl-S in the repo history and only have log show the subset of the important ones, you could always tag the "important" changesets and then alias log to log -r "tagged()". Or you could use the same principle with some other revset predicate, such as including the text 'autosave' in the auto-committed messages and using log -r "not keyword(autosave)", which would show you all non-autosaved commits.
To accomplish your goal, at least as I'd approach it, I'd use the mq extension and auto-commit the patch queue repository on every save. Then when you've finished your "idiot fumblings" you can hg qfinish the patch as a single changeset that can be pushed. You should (as always!) keep the changes centered around a single concept or step (e.g. "fixing the save button"), but this will capture all the little steps it took to get you there.
You'd need to
hg init --mq once to initialize the patch queue repo (fyi: stored at \.hg\patches\)
hg qnew fixing-the-save-btn creates a patch
then every time you save in your IDE
hg qrefresh to update the patch
hg commit --mq to make the small changeset in the patch queue repo
and when you are done
hg qfinish fixing-the-save-btn converts the patch into a changeset to be pushed
This keeps your fumblings local to your repo complete with what was changed every time you saved, but only pushes a changeset when it is complete. You could also qpop or qpush to change which item you were working on.
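
The per-save part of those steps is what the IDE script would run. A minimal sketch, assuming the mq extension is enabled, the patch queue repository exists, and a patch is currently applied; the function name and the "autosave" message format are made up:

```shell
# Called by the IDE on every save.
mq_autosave() {
    # Fold the working-directory changes into the current patch ...
    hg qrefresh || return 1
    # ... and record this state of the patch in the patch queue repository.
    hg commit --mq -m "autosave $(date '+%Y-%m-%d %H:%M:%S')"
}
```

If a save did not change anything, the commit step has nothing to record and returns a non-zero status, which the calling script can safely ignore.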
If you were to try the squash method, you'd lose the fumbling history when you squashed the changesets down. Either that or you'd be stuck trying to migrate work to/from the 'real' repository, which, I can tell you from experience, you don't want to do. :)

I would suggest using branches. When you start a new feature, you create a new branch. You can commit as much and as often as you like within that branch. When you are done, you merge the feature branch into your trunk. In this way, you basically separate the history into two categories: one fine-grained (history in feature branches), and the other coarse-grained (history in the trunk). You can easily look at either one of them using the command: hg log --branch <branch-name>.

Why does Mercurial make you pull/update/merge for unrelated files?

For larger teams, having to pull/update/merge then commit each time makes no sense to me, specifically when the files that were changed by other developers have nothing to do with my changeset files.
i.e. I change file1.txt, and someone else changes file10.txt. Why must I merge on my computer before being allowed to push?
It makes pushing a big pain, as you have to constantly pull/update/merge if many developers are committing.
Also, it makes your changeset look much larger than it was, since it shows your merges as separate commits.

Mercurial makes you do this since its atomic unit isn't a file but a changeset, that is, a node containing a group of changes. Each changeset is an individual node in history and represents what that person did. This does result in you having to merge even if no common files were changed (which would be a simple automatic merge). These merge nodes are important since they are part of your repository's history and give Mercurial ancestry information for future merges.
That said, there is an extension you can use that would clean up your history a bit (but won't resolve your issue with needing to pull before you push). It is called the rebase extension; it is shipped with Mercurial but disabled by default. It adds a new argument to pull that looks like:
hg pull --rebase
This will pull new changes and move your local changesets linearly on top of them without creating a merge changeset. However, I would urge against using this, since you lose information about your repository by rewriting its history. Read this post for information about some issues that this may cause.

Well, you could try using rebase, which will avoid the merge commits, but it is not without its own perils. You can also collapse to one step by doing "hg pull --update", rather than separate hg pull; hg update commands.
As for why you must merge on your computer: this is a direct consequence of mercurial being a distributed version control system. There is no central server which can be considered canonical (unless you create one by convention), so there is no other "place" where the merge could occur. You are the only one who can decide how the information in your repo should be combined with the information in the remote repo. The results of these decisions must be recorded, and that is the origin of the merge commit.
Also, in your example the merge would happen without user interaction since there are no conflicts (the same would be true with rebase), so I don't see why that is a problem.

Because having changes in disjunct files does not guarantee that they are independent.
When you pull in changes, even if they are in files that are untouched by your local changes, it can cause your local changes to stop working. E.g. an interface that you access from newly written code could have been changed.
This is why there is always a merge step in between, so that a human can review the changes, test for issues, and address them before integrating the changes back into the main repository. This step is very important, because skipping it risks blocking all those 50-100 colleagues (which is very expensive).
I would take Lasse’s advice and push less often. Merging isn’t a big deal if you only need to do it twice or thrice a day. Also maybe create smaller team repositories (or branches) that are merged with the main repository daily by a designated person.

Condensing Mercurial revision history

We have 2,700+ revisions and it takes a good 30-45 seconds to load Mercurial when doing a merge, push or anything else with TortoiseHg. I'm wondering if there's a way other than straight up creating a new repository to clean up the revision history. Say, cut off files under revision 2,400 or so.

Not an answer to your question, but:
Maybe reducing "log batch size" to 100 (default is 500) in the settings helps.
Our 2300+ rev repo loads in 2-3 secs (off my 15k rpm SAS-disk, but never mind that), so I don't think your problem is many revs, really. There are much bigger repos out there. :)
Note that both Mercurial core and TortoiseHg developers are keen on finding performance bugs, so it might be worthwhile to ask on the mailing lists for assistance.

You can use the histedit extension to compress several changesets into one. Executing the histedit command on a range of revisions will spawn a text document that looks like this (from the histedit documentation):
pick c561b4e977df Add beta
pick 030b686bedc4 Add gamma
pick 7c2fd3b9020c Add delta
# Edit history between c561b4e977df and 7c2fd3b9020c
#
# Commands:
#  p, pick = use commit
#  e, edit = use commit, but stop for amending
#  f, fold = use commit, but fold into previous commit
#  d, drop = remove commit from history
Changing pick to fold for a certain changeset in the list above will fold it into the previous changeset. It will give you an opportunity to resolve failed merges and enter a new commit message as well.
WARNING:
Using histedit will modify the repository history, including hash IDs, which will cause problems unless you restart each developer with a new repository clone after the changes have been made. Also, you would probably need to limit your histedit-ing to changesets with a single parent (i.e. non-merge changesets).

Merging changes to a workspace with uncommitted changes

We've just recently switched over from SVN to Mercurial, but now we are running into problems with our workflow. Example:
I have my local clone of the repository which I work on. I'm making some highly experimental changes to our code base, something that I don't want to commit before I'm sure it works the way it is supposed to, I don't want to commit it even locally. Now, simultaneously, my co-worker has made some significant improvements/bug fixes which I need. He pushes his commits to our main repository. The question is, how can I merge his changes to my workspace without the requirement that I have to commit all my changes, since I need his changes to test my own code?
A more day-to-day problem we have with the exact same workflow is where we have a couple of configuration files which are in the repository. Each developer makes a couple of small environment-specific changes to the configuration files, but does not commit the changes. These couple of uncommitted files hinder us from making any merges to our workspace, just like with the example above. Ideally, the configuration files probably shouldn't be in the repository; unfortunately, that's just how it has to be, for here unnamed reasons.

If you don't want to clone, you can do it the following way.
hg diff > mylocalchanges.txt
hg revert -a
# Do your merge here, once you are done, import back your local mods
hg import --no-commit mylocalchanges.txt

There are two operations, as you've discovered, that make changes from one person available to someone else (or many, on either side).
There's pulling, which takes changes from some other clone of the repository and puts them into your clone.
There's pushing, which takes changes from your repository and puts them into another clone.
In your case, your coworker has pushed his changes into what I assume is your central master of the repository.
After he has done this, you can pull the latest changes down into your repository, and merge them into your branch. This will incorporate any bugfixes or changes your coworker did into your experimental code.
This gives you the freedom of staying current on other coworkers development in your project, and not having to release your experimental code until it is ready (or even at all.)
So, as long as you stay away from the Push command, you're safe.
Of course, this also assumes nobody is pulling directly from your clone of the repository, if they do that, then of course they will get your experimental changes, but it doesn't sound like you've set it up this way (and it is highly unlikely as well.)
As for the configuration files, the typical way to do this is that you only commit a master file template into the repository, with a different name (i.e. with an extra extension such as .template), and then place the name of the real configuration file into the ignore filter.
Each developer then has to make his or her own copy of the template, rename it, and change it in any way they want, without the risk of committing database connection strings, passwords, or local paths, to the repository.
If necessary, provide a script that will help the developer make the real configuration file if it is long and complex.
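
Such a helper script can be very small. A sketch, with make_local_config and the file names in the usage note as purely illustrative choices:

```shell
# Create a private config file from the committed template, unless the
# developer already has one. Usage: make_local_config TEMPLATE TARGET
make_local_config() {
    template_file=$1
    target_file=$2
    if [ -e "$target_file" ]; then
        echo "$target_file already exists, leaving it untouched"
        return 0
    fi
    cp "$template_file" "$target_file"
}
```

For example, make_local_config app.config.template app.config, with app.config listed in .hgignore so that the private copy can never be committed by accident.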

Regarding your experimental changes, you should commit them. Often.
Simply commit them in a clone you don't push. You only pull to merge whatever updates you need from other repos.
As for config files, don't commit them.
Commit template files, and a script able to generate complete config files from the template.
That way, developers will only modify "private" (i.e. not committed) config files with their own private values.

If you know your uncommitted changes will not collide with the merge commit that you are creating - then you can do the following...
1) Shelve the uncommitted changes
2) Do the pull and merge
3) Unshelve the uncommitted changes
Shelving effectively stores your uncommitted changes away as a diff (relative to your last commit) and then rolls back those files in your local workspace. Unshelving then applies that diff, bringing back your uncommitted changes.
Tools such as TortoiseHg have shelf built in.
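
On the command line, the three steps might be sketched as below, assuming the hg shelve and hg unshelve commands are available (on older versions they come from the shelve extension, which has to be enabled). The function name is made up. The sketch also assumes you have no local commits of your own, so a plain hg pull -u is enough; with diverging local commits you would pull, merge and commit instead.

```shell
# Set local edits aside, bring in the co-worker's changes, restore the edits.
pull_with_shelve() {
    # hg shelve has nothing to do on a clean working directory, so check first.
    if [ -n "$(hg status -mard)" ]; then
        hg shelve || return 1
        shelved=yes
    else
        shelved=no
    fi
    if ! hg pull -u; then
        echo "pull failed; any shelved edits can be restored with hg unshelve" >&2
        return 1
    fi
    if [ "$shelved" = yes ]; then
        hg unshelve
    fi
}
```

If the incoming changes touch the same lines as your shelved edits, hg unshelve stops and asks you to resolve the conflict, which is why the answer above limits this approach to changes that will not collide.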
