I need to collaborate on a Mercurial repository (let's call it "foo") with some people who are novices at version control in general, Mercurial in particular.
I am trying to come up with a workflow that will enable us to use Mercurial without a lot of extra effort on either their end (confusion) or my end (cleanup).
My concern is that as novices I need to expect them to make errors, and I need to allow them to do so in a controlled way, otherwise they won't use the tool at all because they're too scared. But I don't want a bad change to pollute the repository unnecessarily.
I do not expect them to be able to merge properly or to use the mq extension. This is not
a matter of underestimating them, instead it is a realistic assessment given past experience with SVN and my own experience with Hg.
Which of the following approaches would make the most sense? Or if there's a better approach, what is it?
We have a repository foo-submit, read/writable by all, and a repository foo-trunk, readable by all but writable by admins. Users pull from foo-trunk, and push changes to foo-submit. Cleanup: If I find a good change, I let it through as is; if I find a bad change, I "bypass" it by merging with the previous version.
We have a repository foo-trunk readable by all, writable by admins. Each user is responsible for maintaining their own clone which is read-accessible to the rest of the team. When someone wants to push a change, they let me know and I pull it from their repository, with proper cleanup as necessary (same as in #1)
We have a repository foo-dev, read/writable by all, and a repository foo-trunk, readable by all but writable by admins. Users pull/push to foo-dev, and work in named branches if they need to do extensive development. I am responsible for performing merges and cleanup. The foo-trunk repository is merely for having a "clean" copy that has branches where the tip is always in a good state.
Good question, and one that I've never seen a great answer to.
That said, I like option 2. This is the "Pull Request" model used by the Linux Kernel and made popular by GitHub. It allows the admins to act as gatekeepers / reviewers, only allowing good change-sets to get past them when they're happy. If they decide a developer hasn't delivered something worthy, then the pull request is rejected (with reasons). Then the developer can go away, fix up their code / repo, and submit another pull request.
Running a server with something like RhodeCode on it can help keep on top of pull requests. As things grow you can have lower level gatekeepers that deal with subsystems, and higher level gatekeepers that deal with the whole project.
The bit I've never quite got my head around is what should happen to change-sets that are rejected, and that the developer decides to abandon rather than fix up and try again. They could be closed, but then could possibly appear by mistake as part of a future pull request. They'd be harmless, but possibly confusing. The alternative is stripping them, but that sounds like giving people tools they'll cut themselves on.
The other 2 options you give deserve a little comment.
1 is similar to 2. You're still doing a "Pull Request" type flow, but now you have server side branches which mirror the developer's clones. There's little difference and this is how a RhodeCode, GitHub, BitBucket server would let you work, except you don't have to go searching for changes. The server would tell you they're waiting for you to look at.
3 has the problem that everyone's changes are all merged together on foo-dev before you get to them. They would start becoming inter-dependent, and cherry-picking is going to be messy. You'd probably end up grafting change-sets on to foo-trunk which means you're creating new change-sets with new hashes. When the developers pull those they'll now have the change in two places; their original foo-dev version and your grafted foo-trunk version. This doesn't sound sustainable to me.
Best way i can think of if you don't want to use mq (understand with the least hassle for you) is to have your dev
create their own branch for the current feature being developped
merge it back to the main dev branch (or graft/transplant) when it's completed and validated
and then close the branch.
In the long term see for them to learn mq, it's not too hard to grasp.
3a - foo-dev has protected default branch (only some admins can push/merge-to this branch), users use named branches
Related
I've thought this through and I think I understand the implications, but I wanted to get a sanity check because the caveats on https://www.mercurial-scm.org/wiki/ShareExtension are pretty general.
Specifically, the warning is "It's probably not a good idea to mix MQ and shared clones; if you do so, you should definitely avoid pushing/popping patches in one clone while another clone has patches applied."
However based on my understanding of how Mq works, it's only unsafe to push/pop patches (create/destroy history) if you have two shares whose working directory parent would be affected by such changes. That is, if you have two shares that are updated to separate named branches, pushing/popping patches from one should only have the effect on the other of creating/destroying history that is unrelated to the working directory and thus should NOT have any undesirable side-effects.
There will be small side effects, such as revision sequence number changes in some situations, but nothing that should jeopardize correctness or cause problems with the working directory.
Is this correct or am I missing something?
I'm not sure about this, but AFAIK if you end up having the exact same file content in both branches (repos), this might still wind up as shared storage and wreak havoc.
This is obviously not a definitive answer, but I just wanted to report back in case anyone else is interested in this situation. I've been running for several months now with multiple shares on a "central copy" of a large repo, with each share being dedicated to its own branch, and using MQ freely within each share. I have not hit any problems. History changes on other branches just look the same as pulls/strips would -- unrelated changesets being added, modified, and removed.
My company is moving from Subversion to Mercurial. One of the reasons is with Hg we hope to be able to work more independently.
We are looking forward to using rebasing as our primary way to update from the main repository, at least in the beginning, to keep the history in one line, making the transition from Subversion easier.
Right now, if we need to work independently, we have two options: create a branch in Subversion, and commit there (aka merge hell), or not commit at all. With Mercurial we hope to be able to keep committing locally, and rebase every so often, thus gaining independence while staying free of the administration costs of bcreating named branches.
This all sounds cool until backing up comes into the picture. With Subversion it was obvious that if someone didn't commit, their work ccould be lost. But not committing became inconvenient quickly (no history, no log messages etc.), so people would commit time and again nevertheless.
With Mercurial it is going to be possible to go on committing and rebasing without pushing for extended periods of time, putting much more work at risk. So a question comes up: how to back up the stuff on developers' machines?
One solution would be to use some external backup software, but that doesn't sound like a very good idea.
We could also push to the main repo all the time (maybe even automatically?), but this would make it impossible to use rebasing, and would result in a lot of dangling heads in main.
We could push to a backup repo, and try to have only one head in the main repo. This sound a lot complicated.
Are there any other ways to do this? I'd like to find a solution that would let our developers use most of their Subversion knowledge in the beginning.
Just to throw this out there: I think you're making a mistake. A linear history isn't a big deal and pull/merge is the much more normal mercurial workflow. Embrace a non-linear history and leave rebase for special occasions.
You say "With Mercurial we hope to be able to keep committing locally, and rebase every so often, thus gaining independence while staying free of the administration costs of bcreating named branches.", but in mercurial one uses unnamed branches for this, so there's no administration costs.
See http://stevelosh.com/blog/2009/08/a-guide-to-branching-in-mercurial/#branching-anonymously for an explanation of how unnamed branches created automatically will give you exactly what you want with zero hassle.
I know it sounds like too-good-to-be-true snake oil, but your folks can just hg pull and hg merge and hg push when they're through and without anyone having to think about branch names, or who has what clone, or anything, you'll have a coordinated center with disconnected work.
You could also give each developer a repo:
https://example.com/repos/awesome-product
https://example.com/repos/awesome-product-wolever
https://example.com/repos/awesome-product-pintér
Where they can push to all they want.
It would mean a lot of dangling heads, but that's probably OK because nobody would look at them until it comes time to restore… Then you'd simply ask for only the ancestors of the current tip:
hg pull https://example.com/repos/awesome-product-wolever -r tip
If I were using this scheme, I'd set:
[paths]
backup = https://example.com/repos/awesome-product-wolever
in .hgrc/hgrc.
Alternatively, you could give each dev one overall backup repo by setting this in their global ~/.hgrc:
[paths]
backup = https://example.com/repos/wolever-backup
I think you're doing it the wrong way.
What you describe looks more like you're trying to put in place a tool to ensure/enforce a process, rather than to support the process.
Working in a decentralized way is quite a big shift compare to the centralized approach, and by trying to put the kind constraint you describe will more than likely cause more hurt than good for the experience perceived by your developers.
If you're scared that they won't push their changes on a regular basis to a main repository, then it would be good to ask yourself (and to ask them too) "Are we ready to change our process?"
Many people still claims that the DVCS will bring chaos to their software factory. It's true that the DVCS'es don't enforce a linear (single-point-of-failure) way to work, but what bring chaos is not the tool, it is the team spirit you built in your company.
Now, if you HAVE TO migrate to Mercurial (because it's already sold to the management level, or whatever), but feels more comfortable with the SVN approach, with a little "plus", try using HgSubversion first. Migration will still be easy later.
I'm struggling to find the mercurial workflow that fits the way that we work.
I'm currently favouring a clone per feature but that is quite a change in mindset moving from Subversion. We'll also have issues with the current expense we have in setting up environments.
Using hg pull --rebase seems to give us more of a Subversion-like workflow but from reading around I'm wary of using it.
I think I understand the concepts and I can see that rewriting the history is not ideal but I can't seem to come up with any scenarios which I personally would consider unacceptable.
I'd like to know what are the 'worst' scenarios that hg pull --rebase could create either theoretical or from experience. I'd like concrete examples rather than views on whether you 'should' rewrite history. Not that I'm against people having opinions, just that there already seem to be a lot of them expressed on the internet without many examples to back them up ;)
The first thing new Mercurial converts need to learn is to get comfortable committing incomplete code. Subversion taught us that you shouldn't commit broken code. Now it's time to unlearn that habit. Committing frequently gives you a lot more flexibility in your workflow.
The main problem I see with hg pull --rebase is the ability to break a merge without any way to undo. The DVCS model is based on the idea of tracking history explicitly, and rebasing subverts that idea by saying that all of my changes came after all of your changes, even though we were really working on them at the same time. And because I don't know what your changes are (because I was basing my code off of earlier changesets) it's harder for me to know that my code, on top of yours, won't break something. You also lose the branching capabilities by rebasing, which is really the whole idea behind DVCSs.
Our workflow (which we've built an entire Mercurial hosting system around) is based on keeping multiple clones, or branch repositories, as we call them. Each dev or small team has their own branch repository, which is just a clone of the "central" repository. All of my new features and large bug fixes go into my personal branch repo. I can get that code peer reviewed, and once it's deemed ready, I can merge it into the central repo.
This gives me a few nice benefits. First, I won't be breaking the build, as all of my changes are in their own repo until they're "ready". Second, I can make another branch repo if I need to do a separate feature, or if I have something longer-running, like for the next major version. And third, I can easily get a change into the central repo if there's a bug that needs to be fixed quickly.
That said, there are a couple different ways you can use this workflow. The most simple, and the one I started with, is just keeping separate clones. So I'll have website-central, website-tghw, etc. It works well, especially since you can push and pull between them locally. More recently, I've started keeping multiple heads in the same repo, using the remotebranches extension to help manage them and hg nudge to keep from pushing everything at once.
Of course, some people don't like this workflow as much, usually because their Mercurial server makes it hard to make server-side clones. In that case, you can also look at using named branches to help keep your features straight. Unfortunately, they're not quite as flexible as Git branches (which is why we prefer branch repos) but they work well once you understand how to close branches, and why you can't really get rid of them once you start one.
This is getting a bit long, so I'll wrap it up by encouraging you to embrace the superior branching and merging that Mercurial provides (over SVN). There is definitely a learning curve, but once you get the hang of it, it really does make things easier.
From the question comments, your root issue is that you have developers working on several features/bug fixes/issues at one time and having uncommitted work in their working directory along with some completed work that is ready to be pushed back to the central repository.
There's a really nice exchange that covers the issue well and leads on to a number of ways forward.
http://thread.gmane.org/gmane.comp.version-control.mercurial.general/19704
There are ways you can get around keeping your uncommitted changes, e.g. by having a separate clone to handle merges, but my advice would be to embrace the distributed way of working and commit as often as you like - if you really feel the need you can combine the last few local commits into a single changeset (using MQ, for example) before pushing.
Im in the process of trying to get my head round a dvcs such as mercurial. Im getting quite confused with certain points though. Firstly, a bit of context:
At the minute i mostly use subversion, and it works fine for my workflow,
Mostly the repository is for my own use, im the only web developer,and i only ever submit raw code to my manager, he never has to see the repository.
I use the repo to create major versions, and as backup so i can revert to it when something doesnt work out.
The repo also acts a file share, enabling me to work from the same codebase at work and at home.
My main reason for wanting to switch to mercurial, is the offline commits and easier branching / merging.
Firstly can anyone tell me how i would get mercurial to fit this workflow?
How do i go about sharing multiple repositories (i.e. one for each project) between computers?
Any help would be hugely appreciated,
Thanks
http://hginit.com/
There is a fantastic pre-chapter there specifically for SVN users. The rest of the tutorial will get you on your feet fairly quickly.
I'll answer just one part of your question, that of how to manage access to your repository from both home and work, because this is one of the situations where distributed version control is really useful.
The answer is that your two repositories are clones of one-another (to be correct, one is the clone of the other). You do some work during the day, check it in, then pull that work to your home repository (or push, but that requires more work). The next morning, you do the same thing in reverse. Mercurial comes with a built-in read-only HTTP server that makes it really easy, provided that you can expose a port.
The end result is that you have two repositories (ie, automatic backup of the entire history). At any given point in time, one is "better" than the other, but since you're the sole committer to both, they won't diverge.
I have several projects with a very large over-lapping code-base. We've just recently started using SVN so I'm trying to figure out how I should be using it.
The problem is that as I'm finishing a task on one project, I'm starting a task on another, with some overlap. Often there's a lot of interrupt driven development as well. So, my code is never really in a completely stable state that I feel comfortable checking in.
The result is that we're not really using the VC system, which is a VERY bad thing, we all know... so, suggestions?
Check out a personal branch of the code and merge in changes. At least you will have some version control for your own changes, in case you need to roll back. Once you are comfortable with the state that your branch is in, merge that branch back into the trunk.
You can also check out a branch for each task, instead of one for each individual. You can also merge changes to your branch from the trunk if someone changes the trunk, and you want your branch to reflect the changes.
This is a common way to use SVN, although there are other workflows. I have worked on projects where I was afraid to commit(I would break the build possibly) because we did not effectively use branching.
Branching is really powerful in helping your workflow, use it until you're comfortable with the idea of merging.
Edit: 'Checking out a branch' refers to creating branch in your branches folder, and then checking out that branch. The standard svn repository structure consists of the folders trunk, tags, and branches at the root.
So, my code is never really in a completely stable state that I feel comfortable checking in.
Why is that ?
If your branch is appropriate for your work (with a good naming convention for instance), everyone will know its HEAD is not always stable.
In this kind of "working" branch, just put some tag along the way to indicate some "stable code points" (which can then be queried by any tester to be deployed).
Any other version on that working branch is just made to record changes, even though the current state is not stable.
Then later you merge all on a branch supposed to represent a stable state.
In TFS, you are able to create 'Shelf Sets' (I'm not sure what they'd be called in other source control providers). When you shelve some code, you are saving it to your repository, but not checking it in.
The reason this is important is that if you are working on Bug XXXX, and you fix half of the code, but it's not stable and not 'check-in-able', but you get assigned to NewFeature YYYY, you SHOULD NOT continue working with the same code base. You should 'Shelf' your Bug XXXX code, then return your local codebase to the latest checked-in code, and implement NewFeature YYYY.
This way you are keeping your check-ins atomic. You don't have to worry about losing your work, because it is still held by the repository (so if your computer bursts into flames, you don't have to burst into tears), and you aren't mixing your fixes for XXXX with your new code for YYYY.
Then, once you are asked to go back to XXXX (assuming you've checked in YYYY) you can just unshelve your 'shelf set' and jump right back into it where you left off.
Either accept that the code in SVN is not in a completely stable state and check it in anyway (and reserve time for stabilization and refactoring every X days/weeks so the code doesn't degrade too much).
Or force your team to work in a more structured way with minimal interruption based development so you can check in good code.
The first option is not ideal (but better then no source control), the second is probably impossible - there is no third option.
If you don't have time to get the code to a stable state you defiantly don't have the time to branch and merge all the time.
In distributed sourcecontrol systems like GIT, you commit to your local repository. Only when you push your code, it's 'committed' to the remote repository.
In this way, its much easier to 'safe' your work in between.