My desire is to keep my fork up to date with the parent, and ideally record the parent's individual commits + messages, so that I have a full history in the fork, of what changed in the parent.
So essentially, when you start a fork and you see the parent's entire commit history as the "starting point" for your fork, I would like to keep the parent's commit history ongoing in the fork, with my fork-specific commits interspersed (with conflict resolution as needed).
Is this possible?
Or if that's not possible, is there a way to script an automatic pull & merge of parent changes into the child fork, while coalescing the parent's commit messages into the new merge's commit message? (so all individual messages from imported commits, would be combined into one massive merge message)
If I am not mistaken, mercurial and git share a lot in terms of data structures used for history and, in git at least, merging does just what you ask for. Just make sure that your merge commit (or changeset, as hg calls it) actually links to both parent commit (the mostrecent parent project commit and your fork's most recent commit), and that's it.
After merging, check upon the tree with, for example, hgk. It should look like:
http://wiki.genunix.org/wiki/images/3/3f/Screenshot-hgk.png
Related
I've got my IDE set to commit locally every time I save anything. I'd ideally like to keep an uncensored record of my idiot fumblings for the rare occasions they may be useful. But most of the time it makes my history way to detailed.
I'd like to know a good strategy to keep that history but be able to ignore it most of the time. My IDE is running my own script every time I save, so I have control over that.
I'm pretty new to Mercurial, so a basic answer might be all I need here. But what are all the steps I should do when committing, merging, and reporting to be able to mostly ignore these automatic commits, but without actually squashing them? Or am I better off giving up and just squashing?
Related question about how to squash with highly rated comment suggesting it might be better to keep that history
Edit - My point here is that if Mercurial wants to keep all your history (which I agree with), it should let you filter that history to avoid seeing the stuff you might be tempted to squash. I would prefer not to squash, I'm just asking for help in a strategy to (in regular usage, though not quite always) make it look as much as possible like I did squash my history.
You want to keep a detailed history in your repo, but you want to have (and be able to export) an idealized history that only contains "reasonable" revsets, right? I can sympathize.
Solution 1: Use tags to mark interesting points in the history, and learn to ignore all the messy bits between them.
Solution 2: Use two branches and merge. Do your development in branch default, and keep a parallel branch release. (You could call it clean, but in effect you are managing releases). Whenever default is in a stable state that you want to checkpoint, switch to branch release and merge into it the current state of default-- in batches, if you wish. If you never commit anything directly to release, there will never be a merge conflict.
(original branch) --o--o--o--o--o--o--o (default)
\ \ \
r ... ... --r--------r (release)
Result: You can update to any revision of release and expect a functioning state. You can run hg log -r release and you will only see the chosen checkpoints. You can examine the full log to see how everything happened. Drawbacks: Because the release branch depends on default, you can't push it to another repo without bringing default with it. Also hg glog -r release will look weird because of the repeated merges.
Solution 3: Use named branches as above, but use the rebase extension instead of merging. It has an option to copy, rather than move outright, the rebased changesets; and it has an option --collapse that will convert a set of revisions into a single one. Whenever you have a set of revisions r1:tip you want to finalize, copy them from default to release as follows:
hg rebase --source r1 --dest release --keep --collapse
This pushes ONE revision at the head of release that is equivalent to the entire changeset from r1 to the head of default. The --keep option makes it a copy, not a destructive rewrite. The advantage is that the release branch looks just as you wanted: nice and clean, and you can push it without dragging the default branch with it. The disadvantage is that you cannot relate its stages to the revisions in default, so I'd recommend method 2 unless you really have to hide the intermediate revisions. (Also: it's not as easy to squash your history in multiple batches, since rebase will move/copy all descendants of the "source" revision.)
All of these require you to do some extra work. This is inevitable, since mercurial has no way of knowing which revsets you'd like to squash.
it should let you filter that history to avoid seeing the stuff you might be tempted to squash
Mercurial has the tools for this. If you just don't want see (in hg log, I suppose) - filter these changesets with revsets:
hg log -r "not desc('autosave')"
Or if you use TortoiseHg, just go View -> Filter Toolbar, and type in "not desc('autosave')" in the toolbar. Voila, your autosave entries are hidden from the main list.
If you actually do want to keep all the tiny changes from every Ctrl-S in the repo history and only have log show the subset of the important ones, you could always tag the "important" changesets and then alias log to log -r tagged(). Or you could use the same principle with some other revset descriptor, such as including the text 'autosave' in the auto-committed messages and using log -r keyword(autosave), which would show you all non-autosaved commits.
To accomplish your goal, at least as I'd approach it, I'd use the mq extension and auto-commit the patch queue repository on every save. Then when you've finished your "idiot fumblings" you can hg qfinish the patch as a single changeset that can be pushed. You should (as always!) keep the changes centered around a single concept or step (e.g. "fixing the save button"), but this will capture all the little steps it took to get you there.
You'd need to
hg qinit --mq once to initialze the patch queue repo (fyi: stored at \.hg\patches\)
hg qnew fixing-the-save-btn creates a patch
then every time you save in your IDE
hg qrefresh to update the patch
hg commit --mq to make the small changeset in the patch queue repo
and when you are done
hg qfinish fixing-the-save-btn converts the patch into a changeset to be pushed
This keeps your fumblings local to your repo complete with what was changed every time you saved, but only pushes a changeset when it is complete. You could also qpop or qpush to change which item you were working on.
If you were to try the squash method, you'd lose the fumbling history when you squashed the changesets down. Either that or you'd be stuck trying to migrate work to/from the 'real' repository, which, I can tell you from experience, you don't want to do. :)
I would suggest you to use branches. When you start a new feature, you create a new branch. You can commit as many and often as you like within that branch. When you are done, you merge the feature branch into your trunk. In this way, you basically separate the history into two categories: one in fine-grain (history in feature branches), and the other in coarse-grain (history in the trunk). You can easily look at either one of them using the command: hg log --branch <branch-name>.
For larger teams, having to pull/update/merge then commit each time makes no sense to me, specifically when the files that were changed by other developers have nothing to do with my changeset files.
i.e. I change file1.txt, and someone else changes file10.txt. Why must I merge on my computer before being allowed to push?
It makes pushing a big pain, as you have to constantly pull/update/merge if many developers are commiting.
Also, it makes your changeset look much larger than it was since it shows your merges as seperate commits.
Mercurial makes you do this since its atomic unit isn't a file but a changeset. That is a node containing a group of changes. Each changeset is an individual node in history and represents what that person did. This does result in you having to merge even if no common files where changes (which would be a simple automatic merge). These merge nodes are important since they are part of your repositories history and gives Mercurial more information for future merges with ancestral information.
That said there is an extension you can use that would clean up your history a bit (but won't resolve your issue with needing to pull before you push). It is called the rebase extension, it is shipped with Mercurial but disabled by default. It adds a new arumument to pull that looks like:
hg pull --rebase
This will pull new changes and moves your local changeset linearly above them without having a merge changset. However, I would urge against using this since you do lose information about your repository since you are re-writing its history. Read this post for information about some issues that this may cause.
Well, you could try using rebase, which will avoid the merge commits, but it is not without its own perils. You can also collapse to one step by doing "hg pull --update", rather than separate hg pull; hg update commands.
As for why you must merge on your computer: this is a direct consequence of mercurial being a distributed version control system. There is no central server which can be considered canonical (unless you create one by convention), so there is no other "place" where the merge could occur. You are the only one who can decide how the information in your repo should be combined with the information in the remote repo. The results of these decisions must be recorded, and that is the origin of the merge commit.
Also, in your example the merge would happen without user interaction since there are no conflicts (the same would be true with rebase), so I don't see why that is a problem.
Because having changes in disjunct files does not guarantee that they are independent.
When you pull in changes, even if they are in files that are untouched by your local changes, it can cause your local changes to stop working. E.g. an interface that you access from newly written code could have been changed.
This is why there is always a merge step inbetween, so that a human can review the changes, test for issues, and address them before integrating the changes back into the main repository. This step is very important, because skipping it risks blocking all those 50-100 colleagues (which is very expensive).
I would take Lasse’s advice and push less often. Merging isn’t a big deal if you only need to do it twice or thrice a day. Also maybe create smaller team repositories (or branches) that are merged with the main repository daily by a designated person.
I am coming from CVS background.
I try to perform branch by cloning.
The current default tree looks like this from hello project.
I try to clone a project out from 'hello' to 'hello-branch-by-clone'.
I did modification on 'hello-branch-by-clone' and commit.
I did not do any modification on 'hello'.
I perform push from 'hello-branch-by-clone' to 'hello'.
I expect to see a branch but I didn't.
This time, I try another way around.
I did modification on 'hello-branch-by-clone' and commit.
I did modification on 'hello' and commit.
I need to pull from 'hello' to 'hello-branch-by-clone', and merge.
I perform push from 'hello-branch-by-clone' to 'hello'.
This time, then only I can see the branch
By applying cloning technique, is there any way I can have a branch view, without having explicitly modify the default repository (hello)
There's no fork because no two changesets share a parent.
A cloned repository isn't special in any way. It's identical to the original. Commits to it are identical to commits in the original repo. They aren't notionally tagged as being on a branch. A clone is just a nice way of having another work area that you can do some work (including commits) without effecting the original.
A fork occurs when two or more commits have the same parent. Often this happens when using clones, but it may not. If there's only one changeset with the same parent, there is no fork.
After your first sequence you've introduced just one changeset (4) which is which has the old tip (3) as it's parent, so it's still a straight line. Only when you introduce a second changeset parented by (3) will you see a fork.
Now remember, even though you 'push'ed the changeset back, and the original "Hello" repo contains all 4 changesets, it's working directory is still pointing to changeset (3). It will stay that way until you run 'hg update' inside it. This means that if you were to make a commit in "Hello" it will be based upon (3), and then a fork will appear. It doesn't matter when this commit is made.
This is what you did in your second sequence.
Hope that helps.
I've tried to use the term 'fork' in this, because 'branch' has lots of meanings, including the 'hg branch' command which does some slightly different things.
Sometimes when I am working on a code change, I need to make a corresponding change to the shared library code in my repository, which is itself a subrepository. When I want to commit the changes, I do so in the parent repository and Mercurial takes care of doing the coresponding commit in the subrepo.
However, I recently started using MQ to better track my own change history and give myself some more freedom to experiment and do large-scale refactoring work more safely. I have MQ enabled on both the parent and subrepo.
In the scenario describe above, am I correct in assuming that if I 'commit' to a new or existing MQ patch, then the subrepo gets ignored? This is what seems to happen in my testing. Does this mean I need to manually manage the patch queue in the subrepo? This gets unwieldy very quickly.
If this doesn't work, that's fine - I can adjust my workflow and avoid MQ when I have cross-repo work to do - but I wonder if I'm missing something or if others have a solution to this scenario they could share.
Update: Seems like this may be 'unsupported' at this point according to this thread: https://www.mercurial-scm.org/bts/issue2499
I tried the following:
Committed the subrepo change manually.
Updated the .hgsubstate file in the parent repository manually with the hash from the subrepo commit.
Tried to qrefresh the parent repository.
The idea was that I could get the commit from the parent repository to always sync with the right subrepo version, even if I had to do it manually. Unfortunately Mercurial seems to want to protect me here (as it should!) and pits out the following when I try to qrefresh:
warning: not refreshing .hgsubstate
In order to 'fix' things I do an empty commit after the hgfinish on the parent repository to get things synced back up. But... that seems silly and makes the code history harder to grok.
Oh well, guess I'll go back to the stone age and stop using MQ in my workflow when there are sub-repositories involved.
I have a repo called MySharedLib, and another repo called MyProject. MySharedLib is included in many different repos by force-pulling (like a Jedi), and NOT using subrepos.
If you clone MyProject, you are left with the following structure:
/MyProject
MySharedLib
OtherStuff
Files...
MySharedLib is not a subrepo. Pulling in changes from MySharedLib is as easy as running:
hg pull -f path/to/MySharedLib.
But if changes are made to /MyProject/MySharedLib, what's the most straightforward/standard way to push ONLY those changes to the MySharedLib repo?
MQ? hg export? hg diff? hg transplant? My understanding is that almost all of these options work (some together, some apart), but I'd like some direction.
And then what happens if a dev makes a commit that includes other files than those within MySharedLib? Obviously this is something to avoid, but I'm just curious.
Here are the constraints that govern what you can push:
you can only push whole changesets -- if you commit some changes together it's all or none on the pushing front, you can't break up a changeset after you commit it
you can't push a changeset without pushing all of it ancestor changesets to
So once you've committed a linear history like this:
[0]---[1]----[2]-----[3]
you can push changesets zero and one without pushing two and three, but if you want to push two you also have to push zero and one.
And if changeset one contains changes to both /MyProject/OtherStuff and /MyProject/MySharedLib/ you have to push those together.
Your only flexibility comes before you commit where you can control:
what goes into a changeset
what the parents of a changeset are (which also have to be pushed with it)
So if your history currently looks like this:
[0]---[1]
and hg status is showing something like this:
M MyProject/OtherStuff/file1
M MyProject/OtherStuff/file2
M MyProject/MySharedLib/file3
M MyProject/MySharedLib/file4
Then you want to make a new changeset that has only the changes for MySharedLib that you want to push:
hg commit --include MyProject/MySharedLib
making your history look like:
[0]----[1]-----[2]
and then, before you commit the changes in OtherStuff you don't want to push you do a hg update to change the current parent revision so that your new changeset will have a parent of one instead of two:
hg update 1
Now when you do:
hg commit
your new changeset, three, will have only the non-MySharedLib changes and a parent of one, so you history will look like this:
[0]-----[1]--------[2]
\
--------[3]
Since two and three aren't one another's ancestors you can push either one without pushing the other.
That said, omnifarious is right: your usage isn't just weird, it's out and out wrong. You should look at a subrepo setup. It almost certainly achieves your goals better than what you or I have just described.
Wow, what a bizarre setup. How do you prevent changes to MySharedLib from being pushed to MyProject? How do the files from MySharedLib ever appear unless you do a merge after pulling them? Once you do the merge, the repos are joined and you will need to make advanced used of hg convert (as described in this question about splitting repositories) to pull them apart again.
Do not do this. Use subrepos. This problem is what they exist to solve.