I'm trying to write some automation processes for a Mercurial repository. One of the things I'd like to be able to do is identify the commits that created new branches. I can't find any easy way to do this.
I'd like to be able to get them all in one go rather than having to find a list of branch names and then for each branch name find the earliest commit.
Preferably I'd also like to have the option to only do it for open branches but that is something I can work around in other ways if needed.
First iteration of a revset for log (it doesn't work as requested yet: it returns more changesets than needed, and it handles only named branches, not anonymous branching, which is fully legal in Hg):
children(branchpoint()) - merge() - branch(default)
For a fairly ordinary tree like this I still can't write the last part of the revset (eliminating the revisions marked "?"), and I'm thinking about using revset() in a template to filter out the "bad children" at the output stage.
Second iteration, easy as "one, two, three"
I was overcomplicating things when I twisted together the revset-based solution. I realized it because
hg log -T"{ifeq(p1.branch, branch,'','{myrev}: {sob}')}"
(where the last part of ifeq() is just my custom string with some [templatealias] entries for fun) does the trick:
>hg log -T"{ifeq(p1.branch, branch,'','{myrev}: {sob}')}"
r4: New branch detected - two
r2: New branch detected - one
and it is a real candidate for an alias.
Testing on real repositories would be welcome.
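If you want to post-process this in a script, here is a minimal Python sketch. It assumes you have dumped one tab-separated line per changeset with a template along the lines of "{rev}\t{branch}\t{p1.branch}\n" (p1.branch comes from the template above; the function name, the field layout and the sample data are my own illustrative choices). It keeps every changeset whose branch differs from the branch of its first parent.

```python
def find_branch_starts(log_output):
    """Return (rev, branch) pairs for changesets that start a named branch.

    Expects one line per changeset with three tab-separated fields:
    revision number, branch name, branch name of the first parent.
    This layout is an assumption for illustration, not an hg default.
    """
    starts = []
    for line in log_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        rev, branch, parent_branch = line.split("\t")
        if branch != parent_branch:
            starts.append((int(rev), branch))
    return starts


# Invented sample mirroring the output shown above (r4 -> two, r2 -> one).
sample = (
    "4\ttwo\tdefault\n"
    "3\tone\tone\n"
    "2\tone\tdefault\n"
    "1\tdefault\tdefault\n"
    "0\tdefault\tdefault\n"
)
print(find_branch_starts(sample))  # [(4, 'two'), (2, 'one')]
```

Like the template itself, this only looks at named branches and at the first parent, so it should be checked against a real repository before relying on it.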
Related
Hold on, this one's going to be rough. I'm attempting to clean up a Mercurial repo that I can only describe as pathological, and it appears the simplest 'fix' is to swap one directory between two branches. I'm not sure if I can adequately explain the situation, but I'll give it a shot.
We have a tool that generates raw source from a large set of input. (Hundreds or thousands of input files, similar number of output files.) This output is then hand-tweaked and optimized as needed, before being declared ready for deployment. The tool is run inside a directory that offers the configuration files, etc.
Something like:
proj/.hg
/input
/output
/config-env
Now, what has happened is that a branch was used for the tweaking, instead of the (to me) logical choice of a shared mq approach. (The team wasn't aware of mq at the time, are mostly CVS trained, but to their credit, they are working to improve the workflow.)
Here's where it gets grody.
The named branch is the raw output branch, and the default branch is the hand tweaked branch.
The output of the tool was continually created, then hg update raw, then hg merge, to try and merge tool refinements to the output into the raw branch, then that was merged back into the 'default' branch to incorporate both the refinements and prior hand patches, using the merge failure list as the pick list for what needed to be further hand edited.
I.e., there's this crazy cycle between the two branches that has made development a minor nightmare of fragility and errors. That's what I'm trying to clean out. What I want is for the default branch to be the raw output, and a new 'deploy' branch that contains the hand patches, with merges going only one way, from default to deploy. Still not optimal, IMO, but until I can convince the powers that be that mq is the way to go, it's a rather large improvement.
But. There's more. The config-env and input directories were altered as time went on, but the changes were not kept clean and in sync between the two branches. The changes to the input and config-env are on the default branch, but the matching raw output is on the raw branch, with stale input and config-env that don't really belong there. I've inquired about removing the stale files as unnecessary, but there are some arguing that we need to retain them. For that reason, I want to swap only the output directory between the two branches.
I considered something like hg convert --branchmap swap.txt --filemap only-output.txt --datesort but then realized I would get only the output directory in the resulting repo. I might be able to literally strip out the output directories from the two branches into their own repos with the branch names swapped, then merge the two repos together (no-output, output-with-swapped-branches), but I figured that perhaps someone on here would have a better idea.
Of course, what I'm aiming for is to have one branch with an mq patch list to store the hand patches, and that patch list be part of the shared repo, but babysteps are in order.
If push comes to shove, I think we can eliminate most of the history of the output directories on both branches, but we do require certain tagged releases to be retained on both branches.
I've read similar posts on SO, the official Hg guide, many articles and guides, and it's still unclear to me what the best Hg workflow is for developing by feature. Maybe some of the articles on the web are years old and don't include the latest features from Hg. Obviously there are also a lot of options in how to approach it.
I'm a solo developer working on a project where a request for a fix or feature will be submitted to me as a task, like "Task #546 - Change whatever". Some of these tasks take a few days, and some tasks are open for months and there's often up to a dozen going at one time. A task is shipped to the final site after it's approved by the requestor.
The Hg guide seems to recommend having a clone per feature. But having a dozen full copies of the site on my drive seems... wasteful? I'm up for trying it, but I've seen other suggestions that make more sense. Do people really have a dozen copies of each site on their dev machine at a time?
Named branches at first sound like what I'd want: I'd name a branch "task 546", work on it, then merge it back in when it ships. I see a lot of discussion about the permanence of the names and about having so many branches (though they can be closed). Some people seem to care about that and some don't. I don't know Hg well enough to know if I care or not, and what the downsides really mean.
Finally, bookmarks seem to be popular in the more recent articles, and it would seem that the best way to use them would be to set a bookmark like "task 546", then merge it back into the main branch with a commit message that includes the task number, to keep a reference to what was being done in the work. I know you can delete bookmarks, but it's unclear if I'd need to do this after the final merge.
So my thought for a combined approach is to have:
one repo
three named branches:
"default" which holds the released version of the site
"dev" on which I do feature development
"test" which would hold all of the tasks being reviewed by the client
on the "dev" branch I would use bookmarks for each of the tasks that I'm working on, so I'd have a head for each task
My workflow for a task/feature would be to:
Update to the main line of the "dev" named branch
Start a new branch using a bookmark for the task "task #123"
Commit changes until I'm ready for the client to review
Merge "task #123" into the "test" branch
Deploy "test" to the test server
Repeat the commit, merge, deploy until ready for production
When approved, merge with the main line of the "dev" branch with a commit message that includes the task name
Merge "dev" into the "default" branch.
Deploy the "default" branch to the live server
Merge "default" into the open feature branches
Thoughts? Would I be better off just having a clone for each feature, and a "live" and "test" repo that I push to?
Edit: I see from some links that I should be doing the development off of "default", so my first change to my listed process would be to use a named "production" branch instead of a named "dev" branch.
Bookmark-style branching (Git-like "branches") works poorly in at least two common cases:
Cross-task merges in the process of development
Going back in time, when you want to see "the whole history of changes for task #123" (you can do it visually or, with some grimacing and jumping through hoops, using revsets)
Named branches have no such problems and, by the way, a workflow with named branches (and only the default branch as the aggregation point) is a less complex and more logical way to work:
Default contains only mergesets from task branches; the head of default is always the "stable version"
Heads of named branches are WIP; branches merged into default are finished (and accepted by the customer, see below) work
Default merged into a task branch (after the task is developed, before the task branch is merged into default) is the equivalent of your "test": without affecting the mainline you can test the final state of the feature integrated into your stable app and show the results to the customer
Accepted work is added to the stable mainline by merging the named branch into default
The full history of changes for any past task can be easily restored with a single short, memorable revset for log: -r "branch(TASK-ID)"
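If the task history is pulled from a script, a small helper can build that log command. This is only a sketch: the function name and the example task id are hypothetical, and it assumes branch names contain no double quotes.

```python
def task_history_command(task_id):
    """Build the argv for `hg log` restricted to one named branch."""
    # Quote the name so characters like "-" are not read as revset operators.
    revset = 'branch("{}")'.format(task_id)
    return ["hg", "log", "-r", revset]


print(task_history_command("TASK-546"))
```

The returned list can be passed to subprocess.run() from inside the repository.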
I like it. +1. This is the way I'd do it.
I have a "main" repository, I clone it and make some changes in the clone. In the meantime, there are other changes in "main", so I pull them and merge them in my clone. I make more changes in the clone, and merge any other new changes from "main". This gives me this graph:
When I finish my work in the clone, I push to the main repository and now the graph in "main" looks like this:
I know they're topologically the same, but to me the first one is clearer (this one is a very simple case, but things could get more complicated).
Is there any way to prevent this? I've found this question about reordering the graph after the fact, but I was thinking maybe there's a problem in my workflow or something I could change to prevent it.
The problem is that the graph is sorted by the revision number, not by the revision date. This is effectively sorting by the date/time that the revisions appeared in the current repository. There is an outstanding issue on the thg project to allow sorting of the list by revision date but one of the developers said that this change would need to involve hiding the graph as he thinks that the re-write of the grapher would be too complicated for too little gain (the issue is here).
There is no workflow involving merge that I know of to fix it because the revisions will never be in the same order on different repositories if work is carried out on more than one repo.
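To illustrate why the numbering differs, here is a toy Python model (not Mercurial code). The changeset ids are invented: a* are commits made in "main", b* are commits made in the clone, m* are the merges done in the clone. Each repository simply numbers changesets in the order they arrive.

```python
def local_numbering(arrival_order):
    """Map changeset id -> local revision number, in order of arrival."""
    return {node: rev for rev, node in enumerate(arrival_order)}


# The clone sees its own commits interleaved with what it pulls from main.
clone = local_numbering(["a0", "b1", "a1", "m1", "b2", "a2", "m2"])
# Main already has a0..a2 and receives everything else in one push.
main = local_numbering(["a0", "a1", "a2", "b1", "m1", "b2", "m2"])

for node in sorted(clone):
    if clone[node] != main[node]:
        print(node, "clone rev", clone[node], "main rev", main[node])
```

Both repositories hold the same set of changesets, but most of them get different local revision numbers, which is what drives the different layout of the graph.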
One way to neaten up the tree would be to use rebase instead of merge after pulling your changes. This would result in a single branch with no merges as it re-writes history to make it appear as though your draft revisions were implemented after the changes that you just pulled. If you want to read up on rebase, that info is here.
I'm an ex-SVN user trying to work out the best way to do branched development in hg. My project is fairly new and currently has no branches. A friend of mine suggested that making a local clone of the repos. and then working in that was better than using a named branch.
So if I use this model, would the workflow be:
[say original project has been cloned to be in c:\projects\sk\tracker]
hg clone https:[url of repos] tracker_featurex [to be issued from c:\projects\sk]
change to subdir tracker_featurex
checkin and push as per normal
[optional, how do I pull changes from the main repos. into this one?]
[final step, how do I get changes from this clone back into the main trunk?]
I need help on whether this workflow is correct and what the exact commands would be for the two steps in the [] braces.
Thanks a great deal to anyone who can help,
Fred
I would recommend you take a look at Steve Losh's post on branching in Mercurial: http://stevelosh.com/blog/2009/08/a-guide-to-branching-in-mercurial/
He goes over various types of branches (clones, bookmarks, named branches, anonymous branches) as well as the commands you would run for each. There are pros and cons to all of them. Local clones are ok if you are the only developer but they are not as useful in a workflow where more than one developer needs to work on a branch. The claim that clones are universally better than named branches is a myth. You should find a branching model that fits your workflow.
Update:
If you do want to do local clones, you can move your changes using hg push from the new workspace (assuming you have a Projects folder and a repo named test):
Projects> hg clone test test-new-feature
Projects> cd test-new-feature
Projects/test-new-feature> <do some work>
Projects/test-new-feature> hg commit -m "Work is done."
Projects/test-new-feature> <might need a pull/merge here>
Projects/test-new-feature> hg push
If there are changes in the test repo you need to pull/merge them before pushing.
You can also hg pull from the original workspace:
Projects> hg clone test test-new-feature
Projects> cd test-new-feature
Projects/test-new-feature> <do some work>
Projects/test-new-feature> hg commit -m "Work is done."
Projects/test-new-feature> cd ../test
Projects/test> hg pull ../test-new-feature
This might create multiple heads in the test repo and you would need to merge/commit.
Projects/test> hg merge
Projects/test> hg commit -m "Merged in new-feature."
Either is a good option. I might recommend pulling rather than pushing. The main difference to me is the location of the merge step. I think pulling from the feature repo makes the history a little more readable.
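If you end up scripting the pull variant, the steps could be collected roughly like this. It is a sketch only: the function name is made up, the commands are the ones from the transcript above, and they are meant to be run from inside the original repo.

```python
def pull_integration_plan(feature_repo, message):
    """Commands to run inside the original repo to bring in a feature clone."""
    return [
        ["hg", "pull", feature_repo],
        # Merge and commit are only needed if the pull created a second head.
        ["hg", "merge"],
        ["hg", "commit", "-m", message],
    ]


for command in pull_integration_plan("../test-new-feature", "Merged in new-feature."):
    print(" ".join(command))
```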
I am a fledgling with Hg, so take what I say with a grain of salt :-)
I love having named branches, but use them very judiciously! There are downsides to the approach I use below, but it works well for my current environment which is a small shop. I don't mind preserving history forever and I'm not concerned with reducing the number of commits (but Mq/record/etc can address this latter bit).
This is how I use branches in the code I work on:
Default branch.
This is built on the build server.
This should only have one head.
This should always compile.
This should always be the "best effort" at completing bugs/features.
"Workbench" branch.
This can have multiple heads.
Anonymous branches are encouraged. Shared bookmarks used to "name" active anonymous branches.
The state should be almost always compilable, but it is not a requirement.
Represents "work in progress".
Okay, so this is what my process might look like: (I've excluded pull/push/merge-theirs steps).
I get a bug report in.
I switch to "workbench" tip (or whatever revision is appropriate).
I fix the bug, possibly committing several times. (I really should learn to use queues or record, etc.)
(If I am interrupted in the above process, e.g. have to work on a different bug, or am otherwise side-tracked, I will create a new head from the revision in step #2, or as appropriate. I may give the current anonymous branch tip a name with a bookmark, or I may not.)
Once complete, I merge the relevant branch/changes into "default" and hopefully the build server still loves me :-)
I think the best thing to do is forget about how branches in SVN worked. They are not like named branches at all, and anyone who says otherwise is latching onto the fact they both have "names" and not much more. Every branch in Hg is part of a "named branch" (that is, has a name associated with it, be it "default" or "workbench" or otherwise). But it doesn't matter, except for organization: a branch is a branch and it doesn't matter if it's referring to the "tip" of an anonymous branch or the tip of the only head (really an anonymous branch itself) in a named branch.
Try a few things, use what works best :)
making a local clone of the repos. then working in that was better than using a named branch.
That is an overly dramatic and ambitious statement in general. When you clone per feature, you have only one branch (a named branch) per repo and, practically speaking, nothing more.
When the feature is finished, you have to push to the parent or pull from the clone in order to return the changes. At this stage, if some work was done in the parent repo after the clone, an anonymous branch will appear (+1 head) and a merge is a must (same as for work in a named branch in one repo). But while a named branch can tell you something quickly later on (you use good names, don't you?), an anonymous branch tells you almost nothing without additional tricks (bookmarks, for example). Part of my repo is below as an example of work in a clone with intermediate pulls and the required merges after those pulls (sorry for the Russian commit messages). I can't even recall now why I had the repo cloned for editorial work; maybe I was just playing with the clone workflow.
I currently use SVN for a number of things that aren't exactly code, for instance xml files, report templates, miscellaneous files, etc. I have several non-developers who are comfortable using TortoiseSVN for this. They typically work as follows:
Person A - does an SVN Update on the folder of interest to them. Or perhaps just on a single file.
Person A - edits whichever file(s) they're working on. Perhaps add or remove files.
Person B - someone else is probably working on different files at this point
Person A - does an SVN Commit to save their changes to the repository.
Very occasionally they'll hit conflicts where more than one person has edited a file. Almost always this is just because they forgot step #1. Because they're always working on separate files, there are (almost) never real conflicts. As long as they do step #1 first everything works fine.
I'd like to move to Mercurial; however, something holding me back is the prospect of having to do a 'merge' all the time, because Mercurial looks at the state of the entire repository, not just the files of interest at a particular time. E.g. the workflow would be like this:
Person A - does a pull and update on the repository. (let's assume there are no local changes so this is straightforward).
Person A - edits whichever file(s) they're working on. Perhaps add or remove files.
Person B - someone else edits, commits, and pushes a different file at this point
Person A - commits changes. Tries to push. Gets an error about multiple heads.
Person A - does a pull and update. update doesn't work: merge required.
Person A - does a merge. If using TortoiseHg it's a bit confusing working out what to click on to do the merge. I guess this is simpler on the command line, provided there are no complications.
Person A - commits the merge.
Person A - pushes the changes.
My resistance is that there are more steps, and the merge step is somewhat hard to get your head around if you're not a developer. Is there a way I can put these steps together to make the process nice and simple?
"Very occasionally they'll hit conflicts where more than one person has edited a file. Almost always this is just because they forgot step #1. Because they're always working on separate files, there are (almost) never real conflicts. As long as they do step #1 first everything works fine."
If this is the case why do you want to use a DVCS? Mercurial is great, but the benefits of a DVCS come from the ability to merge and fork and the ease of doing either, if your workflow requires neither why would you want to switch toolset?
Sounds like the rebase extension might work for you. The workflow becomes:
hg clone
make changes
hg commit
hg pull --rebase
hg push
The local revisions get "rebased" onto the latest tip on pull, which avoids the merge.
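To make the pull and push steps a single action for non-developers, they could be wrapped in a small script. Here is a minimal Python sketch. It assumes hg is on the PATH and turns the rebase extension on for the one command with --config extensions.rebase= (you could enable it in hgrc instead). The function names are mine, and the run parameter exists only so the sequence can be checked without touching a real repo.

```python
import subprocess

# Enable the bundled rebase extension for a single invocation.
REBASE = ["--config", "extensions.rebase="]


def sync_commands():
    """The two steps that follow `hg commit` in the workflow above."""
    return [
        ["hg"] + REBASE + ["pull", "--rebase"],
        ["hg", "push"],
    ]


def sync(repo_dir, run=subprocess.run):
    """Run the steps in repo_dir, stopping at the first failure."""
    for command in sync_commands():
        run(command, cwd=repo_dir, check=True)


for command in sync_commands():
    print(" ".join(command))
```

With check=True, a failed rebase (for example on a real conflict) stops the script before the push, so someone who knows the tool still has to step in for those cases.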
One possible approach is to have a point person who does all the real work of merging. I'm not a big fan of letting everyone push to one shared repos, especially if they don't know what they are doing. An alternative approach is that A has local repos A, B has local repos B, and there is repos S, which combines A and B. Then, don't let A or B push to S. Instead, let an expert pull from A and B and do the merging in S. Then A and B never have to push to S. If they coordinate with the expert, then he/she will already have merged their changes into S by the time they pull updates from S, and so A and B will not have to merge when pulling either. This is actually the default mode in which a DVCS works, since by default all repositories are read-only except to their owner.