Mercurial distributed repositories

Having found myself in the role of build engineer and systems guy, I had to learn and figure out a few things, namely how to set up our infrastructure. Before I came on board they didn't have any. With this in mind, please excuse me if I ask anything that should have been obvious.
We currently have three levels of distributed Mercurial repositories: level one on each developer machine, level two on a central (trunk) server that is only accessible from the local network, and level three on BitBucket. The workflow is as follows:
1. Local development: the developer pulls changesets from the local network server, commits locally, and pushes to our local server once merge conflicts are resolved. A scheduled overnight script backs everything up to BitBucket.
2. Working from home: the developer pulls changesets from BitBucket, commits to their local repo, and pushes to BitBucket.
3. TeamCity picks up repo changes from the local network server for each project and runs a build / automated deploy to the test environment.
The issue I'm hitting is scenario 2: at the moment, if someone pushes something to BitBucket, it's their responsibility to merge it back when they're back in the office. That's a bit of a time waster if it could be automated.
In case you're wondering, the reason we have a central repo on the local network is that it would be slow to run TeamCity builds off BitBucket repositories. I haven't tested this, so it's just an educated guess.
Anyhow, the scheduled script that pushes all changes from the central repository on the local network just runs "hg push" for each of the repositories. It would have to do a pull / merge beforehand. How do I do this right?
This is what the pull would have to use switches for:
- update after pull
- in case of merge conflicts, always take newer file
- in case of error, send an email to system administrator(s)
- anything extra?
Please feel free to share your own setup as long as it's not vastly different to what's described.
UPDATE: In light of recent answers, I feel an important aspect of the intended approach needs to be clarified. The idea is not to force merges on our local network central repo. Instead, it should resolve merge conflicts in the same way as HgWorkbench does on developer machines with post-pull: update + merge. All developers have this on by default, so it should be OK.
So the script / batch file on the server would do the following:
1. Pull from BitBucket
2. Update + auto merge
3. Any auto-merge conflicts?
3.1 Yes -> Send an email to administrators to manually merge -> Break
3.2 No -> Carry on
4. Get outgoing changesets. Will the push create multiple heads? (This might be redundant because of the pull / update)
4.1 Yes -> Prompt administrators. Break.
4.2 No -> Push changes
Hope this clears things up a bit. Now, can this be done using hg commands alone - batch - or do I have to script it? Specifically can it send emails?
Thanks.

So all your work is available at BitBucket, right? Why not make BitBucket (which is available from anywhere) your primary repo source and drop your local servers? You can pull changes from BitBucket with TeamCity for your nightly builds, and developers would always work with the current repo at BitBucket and resolve all merging problems themselves, so there wouldn't be any subsequent merges for you.

I would not try to automatically merge the changes if they are conflicting. This will only lead to broken and inconsistent versions and "lost" changes, causing confusion and chaos. Don't merge automatically if it isn't clear what that merge should look like.
A better alternative would be to just keep the two heads around and push/pull them without merging. This way everybody can still get the version of the data they were working on from work or home. A manual merge will still have to be done, but this can be done either at work or from home, enabling developers to resolve the issue from wherever they are. You can also send emails around in this scenario to make sure everybody is aware of the problem.

I guess that you could automate this using a script; I would try PowerShell if I were you. However, this may sometimes require manual merges when there are conflicts (because when developers commit changes to both the BB and local repos, these changes might conflict).
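The steps listed in the question's update could be sketched roughly as follows. This is a minimal sketch in Python rather than PowerShell, not a tested production script. It assumes that hg is on the PATH, that the server repository has a working directory with no uncommitted changes, that a commit username is configured, and that an SMTP relay is reachable; the mail host, the addresses and the "hg-sync" subject prefix are hypothetical. Conflicting merges are never forced: the merge is abandoned and the administrators are mailed instead.

```python
import smtplib
import subprocess
from email.message import EmailMessage


def run_hg(repo, *args):
    """Run one hg command inside `repo`; return (exit_code, stdout, stderr)."""
    proc = subprocess.run(["hg", "--cwd", repo, *args],
                          capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


def send_mail(subject, body, host="localhost",
              sender="hg-sync@example.com", to="admins@example.com"):
    """Mail the administrators (host and addresses are placeholders)."""
    msg = EmailMessage()
    msg["Subject"], msg["From"], msg["To"] = subject, sender, to
    msg.set_content(body)
    with smtplib.SMTP(host) as smtp:
        smtp.send_message(msg)


def sync(repo, remote, hg=run_hg, notify=send_mail):
    """Pull, merge only if it merges cleanly, then push. Returns a status string."""
    # Step 1: pull from the remote (e.g. BitBucket).
    code, out, err = hg(repo, "pull", remote)
    if code != 0:
        notify("hg-sync: pull failed", f"{repo}\n{out}{err}")
        return "pull-failed"

    # Step 2: update to the newest changeset of the current branch.
    code, out, err = hg(repo, "update")
    if code != 0:
        notify("hg-sync: update failed", f"{repo}\n{out}{err}")
        return "update-failed"

    # Count the heads on the working directory's branch.
    code, out, err = hg(repo, "heads", ".", "--template", "{node|short}\n")
    heads = out.split()
    if len(heads) > 2:
        notify("hg-sync: more than two heads", f"{repo}\n{out}")
        return "too-many-heads"

    if len(heads) == 2:
        # Step 3: non-interactive merge; hg exits non-zero on unresolved files.
        code, out, err = hg(repo, "merge", "--tool", "internal:merge")
        if code != 0:
            hg(repo, "update", "--clean", ".")  # abandon the failed merge
            notify("hg-sync: merge conflict", f"{repo}\n{out}{err}")
            return "merge-conflict"
        code, out, err = hg(repo, "commit", "-m", "Automated merge")
        if code != 0:
            notify("hg-sync: commit failed", f"{repo}\n{out}{err}")
            return "commit-failed"

    # Step 4: hg push refuses to create new remote heads unless forced,
    # so a rejected push lands in the error branch below.
    code, out, err = hg(repo, "push", remote)
    if code == 0:
        return "pushed"
    if code == 1:  # hg push exits with 1 when there is nothing to push
        return "nothing-to-push"
    notify("hg-sync: push failed", f"{repo}\n{out}{err}")
    return "push-failed"
```

A scheduled task could call something like sync(r"D:\repos\project1", "https://bitbucket.org/<team>/<repo>") once per repository (both arguments are placeholders). The hg and notify parameters exist only so the flow can be exercised without a real repository or mail server.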

Related

How to create disposable experimentation workflow in Mercurial?

Coming originally from SVN, I am still new to Mercurial.
I am interested in creating an experimental workflow to see if I can rewrite a troubled feature from scratch. If my attempt fails though, I wish to delete the experimental workflow - abandoning the work — with nobody else ever seeing it.
The problem is though I still need to push changes of this experimental workflow across laptops and PCs and keep working for a couple of weeks. But still keep the option open to delete that branch and fall back to the main branch, without having any trace of the experimental branch.
Is something like this possible in Mercurial and how could I achieve this?
FYI, I am using the mercurialeclipse plugin on Aptana Studio 3.0 (so I use a UI, but commands should be fine too).

After a changeset is pushed to the central server (assuming you have one), there is no way to remove it from there.
So a possible (but terribly inconvenient) solution for you now could be to create a separate personal repository and synchronize your devices through it. If you like the result, you then push to the shared central repo. Otherwise you just delete the temporary repository.

With a Distributed Version Control System like Mercurial you can sync between any clone of a repository, not just a "central" one that all users have agreed to use.
Therefore, you can:
Clone the repository to a private share that the "experimenting" systems can access.
Clone to a USB key and move that between systems.
Use hg serve to start a web server for a local repository on a system and clone and pull that history to other systems.
Use hg bundle/unbundle to package up new history and email it to another system.
To abandon work, just delete all these extra clones and clone from the common "central" repository again.

Can I work in the repository in a single user Mercurial workflow?

I use Mercurial in a single-user workflow to have the option to roll back changes if my coding or writing goes horribly wrong (I primarily use the Stata and R statistics packages and LaTeX). While working only locally, this has been easy since all I have is the main repo.
Recently I have started ssh-ing into a Linux server for more computational power. So far I have been manually copying files back and forth and using Mercurial only locally, but I would like to use Mercurial to take care of this and keep these two workflows synchronized. Also, I like the ability to code both locally (on my laptop or desktop) and on the server.
Do I need to work on a clone of the main repo on the server and keep the main repo untouched? Or can I work directly in the main repo when I am on the server? In this question @gizmo points to this workflow guide; the "single developer" discussion is helpful, but it's still not clear to me that I can work in the main repo while I'm on the server without causing some major problem that I don't yet understand.
Thanks!
Edit: I should add that I have worked through Joel Spolsky's HgInit.com tutorial and I'm comfortable pushing/pulling/cloning/etc over ssh, but I am still not sure if I can work in the main repo without causing heartache later. Or maybe this is more a philosophical question? Thanks!

Mercurial is a DVCS, which means that in each location you have both a local working copy and a local repository.
Mercurial is a DVCS, which also means you can freely exchange (pull/push) data between repos (if they provide remote-access methods).
If you are "comfortable pushing/pulling/cloning/etc over ssh" and don't forget to perform a pull/push cycle around your work at home (so that you don't have to run hg serve on the home host and sync with the server as the source), you won't get any headache at all, and you'll have a perfectly linear aggregated history in each place. Even if you sometimes forget to sync a repo, in the worst case you get two heads later, which you'll be able to merge easily (I don't know the formats of Stata and R data files, but LaTeX, as text, is mergeable).

There is no problem with working directly in the repository on your server. From Mercurial's point of view, the "main" repository is just another random repository — Mercurial doesn't consider it to be special.
You don't say this directly, but one thing that people ask is "What happens when I push to the server?" The answer is that hg push only sends data into the repository (the .hg/ folder). The working copy is not touched on the server when you push to it. Since you push new changesets to the server, you might need to run hg update the next time you work on the server. This is just like if you had run hg pull on the server — there you'll also merge or update afterwards.
I have this situation all the time: I create a repository at home and clone it to my computer at work. I change files in either location and push/pull between the two repositories. If I need to share my work with others, then I make a repository at Bitbucket and push the code there. That way Bitbucket serves as a nice canonical repository for the code and I typically change the default path to Bitbucket in the repositories at home and at work. So at home I would have:
[paths]
default = https://bitbucket.org/mg/<repo>/
work = ssh://mg@work/<repo>
so that I can do hg push to send things to Bitbucket and hg pull work to grab things directly from work (in case I forgot to push to Bitbucket before leaving).

Crucible and multiple mercurial clones handling

I was wondering if Crucible can handle the following scenario with Mercurial.
How do you use DVCSs with Crucible in such a scenario?
There are several issues in a project, for each issue a developer makes a clone of the project from repo "stable-build", to repo "dev-0001" (on a local sharing server).
Clone is named according to the issue : "dev-0001" for example.
Now from there a developer clones on his local machine into clone "local-dev-0001", makes the changes and then pushes to "dev-0001".
Some other developer wants to review the changes in repo "dev-0001" before the dev that implemented 0001 can push to "stable-build".
What I tried is to set up Crucible for a repo (a separate test clone "test-crucible" directly from "stable-build"). It took a long time on a very powerful machine, about 5 days.
My question is: how can Crucible and Mercurial be set up so that one can create reviews for the "dev-0001" clone before it is pushed to a somewhat central server, without waiting 5 days for Crucible to parse the "dev-0001" repo from the start, and maybe use the information of its parent? Is this already done, or does it need some sort of plugin?
I can offer more clarity on the scenario if that was a bit hazy.
Thanks

I think I'm discovering that the answer might be no for this workflow without altering it. I also found this, for anyone searching for the answer:
https://answers.atlassian.com/questions/8798/does-latest-version-support-revieiwing-from-local-repositories-using-mercurial

I think I found a satisfactory answer for my case: a pre-commit patch file could be used,
obtained by making a diff in "dev-0001" before pushing to "stable-build" with: "hg outgoing -p > patch-0001"

How to setup a DVCS system that keeps some of the stuff we like about our centralized VCS?

Right now, we have a small team of developers using TFS for our version control. I'm evaluating the possibility of us moving to a DVCS, and am wondering if we'd need to give up some of the stuff we like about our current system if we moved to DVCS, or if we can find a way to support it.
Right now we have a Stable branch, and 1 branch for each developer (you can think of each dev's branch as a feature branch that is reused from feature to feature). The stuff we like is:
1) Each time a dev checks in changes to his branch, we do a build and test of everything on the servers, followed by automatic deployment of all projects to that developers test environment.
2) Merging from any dev's branch to Stable is done by me, so that I can have 1 last check on what is happening to our stable branch before committing the changes.
3) If I want to help a dev with something, I can just grab latest from their branch and look at it on my machine.
I'm trying to understand how this could work with a DVCS (specifically we are testing with Mercurial).
I'm hoping to be able to pull off something like this:
1) We setup a central repository, and we create Main and Release branches in addition to 1 branch for each developer.
2) All devs clone the repository to their local machine.
3) All work is done in their personal branch against the local repository.
4) When they are done, they would pull from the central repository and perform a local forward integration merge from Main to their branch, to integrate any changes that have been made to Main in the meantime.
5) They would then push their changes to the central repository.
6) Some CI service would pickup this change, causing a build/test/deploy-to-dev of all our projects in that branch.
7) If everything was ok the dev would shoot me an email saying their branch was ready for merging to Main.
8) I could then merge in their changes, either by somehow connecting directly to the remote repository, or by doing a pull->merge->push.
So to deal with our requests:
1) I'm assuming there are some CI tools that can watch a branch in Mercurial and kick off a build/test/deploy process (like CC.Net).
2) I can still manage the final merge process from DevX to Main either by connecting to the remote repo, or by pulling, merging and pushing through my local repo.
3) I believe I could either pull changes directly from another dev's repo, or I could just pull from the central repo, and then update my working directory to work on their code.
So do I have this mostly right?

Yep, you have all that right. Regarding your final assumptions:
1) There's a Mercurial Source Control Block for CruiseControl.NET. Another CI server I've heard of in use with Mercurial is Jenkins.
2) Correct. For integration with Main, I would prefer pulling (from either) and merging on my own machine before pushing, rather than merging on the server.
3) Exactly so.
It sounds like your developers are fairly disciplined, but just in case you need better control over certain aspects of your operations:
You can use hooks to issue warnings when someone tries to merge their branch to Main. In-process hooks have to be written in Python, but they have access to the Mercurial API that way. You could also place hooks on the server that reject pushes containing a merge to Main not done by certain users.
One way some organizations control integration is a pull-only scenario. Only a few developers can push to the official repository and other developers send them pull requests. The Mercurial book's Chapter 6 covers this a bit, too.
A branch per developer is good. A branch per feature is also useful, allowing each developer to work on multiple things in parallel, then merging each to their branch when done. They just have to remember to close that feature branch before doing so, so the branch name doesn't keep popping up. This can be done with clones as well, but I find myself preferring named branches since I have to keep work/backup/laptop development clones all synced so I can work on whatever, whenever. I still do expendable work as a clone first.
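As an illustration of the server-side hook idea above, here is a minimal sketch of an in-process hook that rejects a push containing a merge into Main committed by anyone outside an allowed set. It assumes Mercurial running on Python 3 (where branch names, user names and node ids are bytes) and is meant to be registered as a pretxnchangegroup hook. The allowed user string and the function name are hypothetical, and it checks the committer recorded in the changeset, not the account doing the push.

```python
# hooks.py: sketch of an in-process Mercurial hook (assumed: pretxnchangegroup).

PROTECTED_BRANCH = b"Main"  # branch name taken from the question
# Hypothetical committer string; must match the changeset's user field exactly.
ALLOWED_MERGERS = {b"Integrator <integrator@example.com>"}


def reject_unauthorized_merges(ui, repo, node=None, **kwargs):
    """Return True (which makes Mercurial abort the transaction) if an incoming
    merge changeset on the protected branch has a non-allowed committer."""
    # `node` identifies the first incoming changeset; everything from its
    # revision number up to the end of the repo arrived in this push.
    first_new = repo[node].rev()
    for rev in range(first_new, len(repo)):
        ctx = repo[rev]
        is_merge = len(ctx.parents()) == 2
        if (is_merge
                and ctx.branch() == PROTECTED_BRANCH
                and ctx.user() not in ALLOWED_MERGERS):
            ui.warn(b"rejected: merge into %s by %s\n"
                    % (PROTECTED_BRANCH, ctx.user()))
            return True
    return False
```

On the server this would be enabled in the repository's hgrc with a [hooks] entry along the lines of pretxnchangegroup.nomerge = python:/path/to/hooks.py:reject_unauthorized_merges, where the path is a placeholder.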

Moving from Subversion to Mercurial - how to adapt the workflow and staging/integration systems?

We got all psyched about moving from svn to hg, and as the development workflow is more or less fleshed out, there remains the most difficult part: the staging and integration system.
Hopefully this question goes a bit further than your common 'how do I move from xxx to Mercurial'. Please forgive the long and probably poorly written question :)
We are a web shop that does a lot of projects (mainly PHP and Zend), so we have one huge svn repo with 100+ folders, each representing a project with its own tags, branches and trunk of course. On our integration and testing server (where QA and clients look at work results and test stuff) everything is pretty much automated: Apache is set to pick up new projects automatically, creating a vhost for each project/trunk; the mysql migration scripts are right there in trunk too, and developers can apply them through a simple web interface. Long story short, our workflow is this now:
Checkout code, do work, commit
Run update on the server via web interface(this basically does svn up on server on a particular project and also run db-migration script if needed)
QA changes on the server
This approach is certainly suboptimal for large projects when we have 2+ developers working on the same code. Branching in svn was only causing more headaches, well, hence moving to Mercurial. And here is where the question lies - how does one organize efficient staging/integration/testing server for this type of work(where you have many projects, say single developer could be working on 3 different projects in 1 day).
We decided to have the 'default' branch essentially track production and then make all changes in individual branches. In this case, though, how can we automate staging updates for each branch? Earlier, for one project we were almost always working on trunk, so we needed one DB, one vhost, etc. Now we are potentially talking about N databases per project, N vhost configs, etc. Then what about CI stuff (such as running phpDocumentor and/or unit tests)? Should it only be done on 'default'? On branches?
I wonder how other teams solve this issue, perhaps some best practices that we're not using or overlooking?
Additional notes:
Probably worth mentioning that we've picked Kiln as a repo hosting service(mostly since we're using FogBugz anyway)

This is by no means the complete answer you'll eventually pick, but here are some tools that will likely factor into it:
repositories without working directories -- if you clone -U or hg update null you get a repository with no working directory (only the .hg). They're better on the server because they take up less room and no one is tempted to edit there
changegroup hooks
For that last one the changegroup hook runs whenever one or more changesets arrive via push or pull and you can have it do some interesting things such as:
push the changesets on to another repo depending on what has arrived
update the receiving repo's working directory
For example one could automate something like this using only the tools described above:
developer pushes five changesets to central-repo/project1/main
last changeset is on branch 'my-experiment', so csets are automatically re-pushed to the optionally created repo central-repo/project1/my-experiment
central-repo/project1/my-experiment automatically does hg update tip, which is certain to be on the my-experiment branch
central-repo/project1/my-experiment automatically runs tests in its working dir and, if they pass, does a 'make dist' that deploys, which might set up the database and vhost too
The biggie, and chapter 10 in the Mercurial book covers this, is to not have the user waiting on that process. You want the user to push to a repo that contains possibly-okay code and the automated processes to do the CI and deploy work, which, if it passes, ends up producing a likely-okay repo.
In the largest mercurial setup in which I've worked (20 or so developers) we got to the point where our CI system (Hudson) was pulling from the maybe-ok repos for each periodically then building and testing, and handling each branch separately.
Bottom line: all the tools you need to set up whatever you'd like probably already exist, but gluing them together will be a one-off sort of job.
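The per-branch staging example above could be glued together with a small script run from a changegroup hook. What follows is only a sketch under several assumptions: the branch repos live next to the main repo, branch names are valid directory names, and the branch repo pulls from the main repo instead of the main repo re-pushing (a simplification of the flow described above). The function names are hypothetical, and the test and deploy step is left as a comment.

```python
import subprocess
from pathlib import Path


def run(cmd, cwd):
    """Run a command in `cwd`, raise if it fails, and return its stdout."""
    return subprocess.run(cmd, cwd=cwd, capture_output=True,
                          text=True, check=True).stdout


def on_changegroup(main_repo, run=run):
    """Mirror the branch of the newest changeset into a sibling repo.

    Returns the path of the per-branch repo, or None for 'default'.
    """
    main_repo = Path(main_repo)
    # Branch of the most recently added changeset.
    branch = run(["hg", "log", "-r", "tip", "--template", "{branch}"],
                 main_repo).strip()
    if branch == "default":
        return None

    # e.g. central-repo/project1/main -> central-repo/project1/my-experiment
    target = main_repo.parent / branch
    if target.is_dir():
        run(["hg", "pull", str(main_repo)], target)
    else:
        run(["hg", "clone", "--noupdate", str(main_repo), str(target)],
            main_repo.parent)

    # Check out the branch head in the per-branch repo's working directory.
    run(["hg", "update", branch], target)
    # Tests and the deploy step ('make dist' in the example) would run here,
    # ideally in the background so the pushing user is not kept waiting.
    return target
```

A wrapper script would call on_changegroup with the main repo's path, and could be registered in that repo's hgrc under [hooks] as a changegroup entry that invokes the script.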

What you need to remember is that DVCS (vs. CVCS) introduces another dimension to versioning:
You don't have to rely anymore only on branching (and get a staging workspace from the right branch)
You now have with DVCS the publication workflow (push/pull between repo)
Meaning your staging environment is now a repo (with the full history of the project), checked out at a certain branch:
Many developers can push many different branches to that staging repo: the reconciliation process can be done in isolation within that repo, in a "main" branch of your choice.
Or they can pull that staging branch in their repo and test things out before pushing back.
From Joel's tutorial on Mercurial HgInit
A developer doesn't necessarily have to commit for others to see: the publication process in a DVCS allows him/her to pull the staging branch first, reconcile any conflicts locally, and then push to the staging repo.
