Locking down/securing TortoiseHg's web server

I'm migrating a few projects from SVN to Mercurial and I'm not sure how to address this issue: because we are working with MVC 3, we have some SQL connection strings stored in our Web.config file.
Since TortoiseHg starts a wide-open web server when you click "Web Server" in the context menu, I'm looking into ways to restrict it or lock it down, but I haven't had any luck. We obviously don't want anyone to be able to browse or pull, both of which are enabled by default. The simplest solution is just not to run it, but it's entirely possible that a developer clicks it by accident while trying to synchronize or clone, closes the window with the X, and ends up with the server still running without realizing it.
How do other developers address this? Am I missing something? I've thought about pushing out a GPO that blocks remote access to port 8000, but there's nothing stopping a developer from scrolling up and changing the port, or something equally silly.

After all clarifications, I still believe you're trying to solve the wrong problem.
hg serve is a legitimate tool that can be used to pull changesets between developers on the same network when it's too early to push those changesets to the server. It may or may not fit into your workflow, but I don't think the problem lies there.
If you expect malice, then nothing prevents a developer from exposing the sensitive information in the Web.config (and, for that matter, the source code itself) to a third party, even if you somehow block hg serve.
On the other hand, if you expect carelessness, then you should instruct the developers not to use hg serve, or stop storing sensitive information in Web.config, or both.
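That said, if you also want a guard rail against the careless case: hg serve honours the [web] section of whatever hgrc files it loads, so a machine-wide Mercurial.ini pushed to each workstation (deploying it via GPO is an assumption about your environment, not something Mercurial provides) could default the built-in server to deny reads. A developer can still override this in a user or repository hgrc, so treat it as a guard rail, not a lock; a minimal sketch:

[web]
# deny browse/clone/pull access over hg serve for everyone by default
deny_read = *
# pushing over hg serve is already disabled unless allow_push is set explicitly
allow_push =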

Related

Is there a way to log everything that happens in Mercurial?

I'd need to keep a record of everything that happens with some repositories. By default I can keep a record of merging and commits, but I'd need a record of cloning, pulls, authentication, etc.
Is there a way to keep this logged somewhere?
Cloning, pulling, and authenticating are not part of a repository's history or state - all of that data is intentionally not tracked or pushed around by Mercurial.
In fact, Mercurial doesn't handle this behavior itself; it offloads it (particularly authentication) to the web server handling the requests. What you can do is look at the logs that server records. How this is tracked is very much server-specific, but essentially you'd look at the access logs and see what is being requested.
Mercurial provides a lightweight web server, hg serve, for handling a limited number of requests, and you can configure where it writes its access and error logs with the -A and -E flags, respectively. See hg help serve for more.
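For instance, a rough sketch of running it with both logs (the port and file paths are just placeholders):

hg serve -p 8000 -A /var/log/hg/access.log -E /var/log/hg/error.log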
You could play with Mercurial's hooks and maybe log clones and pulls via that method, but authentication is completely transparent to Mercurial, so logging that must come from the server.
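If you do go the hook route, a minimal sketch might look like this in the served repository's hgrc (the log paths are made up, and again this only records what Mercurial itself sees, not who authenticated):

[hooks]
# runs on the serving repository after changesets are sent out (i.e. someone cloned or pulled)
outgoing = echo "$(date) outgoing, source=$HG_SOURCE" >> /var/log/hg/reads.log
# runs after changesets are added here (push, pull or unbundle)
changegroup = echo "$(date) changegroup, source=$HG_SOURCE" >> /var/log/hg/writes.log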
How are people accessing the repository? If it's over SSH you could easily log the command line that comes in via SSH. If it's over HTTP you can use your web server logs as a pretty fair record.
Mozilla has an open source 'pushlog' (example: https://hg.mozilla.org/mozilla-central/pushloghtml ) that they use to record all writes. You could probably easily tweak that code to record all reads too.

Should I make my repository my DocumentRoot for my website?

I set up Mercurial on my server, but I am unclear how things should be arranged. I've been looking for examples of different setups, but perhaps I am using the wrong keywords. Right now it is only going to be a handful of developers, and I am unsure whether I should just make the repository the DocumentRoot. I really don't know what questions to ask since this is new to me, but I would appreciate it if anyone could provide some knowledge and guidance. Some questions I do have right now are: how should I set up my servers and repositories? Should I set up a separate VirtualHost for a test clone before making it live? Anything would be helpful! Thanks in advance!
There's probably not a reason to do this. I would keep them separate but set up an automated process (either a custom script or continuous integration (CI)) to deploy from Mercurial to the site by running a single command. Optionally, you can make every commit trigger a deployment.
EDIT: With continuous integration, deploying is the CI server's responsibility. If you use SSH, the CI server would pull from hg, export, then upload through SSH. That should address your issues. For a comparison of CI servers that support Mercurial, see this question.
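The deploy step itself can be tiny; a sketch, assuming a build clone on the CI machine and made-up repository and server paths (hg archive gives you a clean export with no .hg metadata):

hg pull -u https://hg.example.com/myproject        # bring the build clone up to date
rm -rf /tmp/myproject-export
hg archive -r default /tmp/myproject-export        # export the default branch, without .hg
rsync -az --delete /tmp/myproject-export/ deploy@www.example.com:/var/www/myproject/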
I don't have the answer to give you, since many variables and needs affect the workflow, but here are some links to get you started:
http://www.zdnetasia.com/a-development-workflow-for-mercurial-62204755.htm
https://www.mercurial-scm.org/wiki/Workflows
http://www.webdevelopment.nicholastuck.com/tools/one-project-one-repository-mercurial-used-right/
I would also recommend reading this excellent Mercurial introduction: http://hginit.com/
You can also find various questions on SO about workflows with Mercurial; have a look at the sidebar on the right, for example.
When you have a more specific question, don't hesitate to ask again!
I would make your DocumentRoot directory a first-level subdirectory of your repository, and here's some reasons why:
If you're using something like Apache to manage your server, you could put other meta-information - like sites-available and sites-enabled configuration files - in a sibling directory, since they're not really a part of the website documents.
Similarly, you can keep a "docs" directory right next to the code.
If your repository root is your DocumentRoot, all other things being equal, you are also serving up your .hg directory, where your whole repository history is, and your .hgignore file, that kind of thing. You can fix this with an .htaccess file or a rule in the virtual host, of course (a sketch follows below), but it's simpler just to have the child folder.
Essentially, codebases tend not to be exactly one-to-one matches with deployed sites, so I tend to favor having the document root be a subdirectory.
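If you do end up pointing DocumentRoot at the repository root anyway, a couple of Apache rules (usable in the vhost or in that .htaccess file; the exact directives you need are an assumption about your Apache setup, and this is not a complete hardening) at least keep the metadata out of reach:

# return 404 for anything that reaches into Mercurial's metadata
RedirectMatch 404 /\.hg(/|$)
RedirectMatch 404 /\.hgignore$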
Deployment is a whole 'nother can of worms. It really depends on your needs as to what you do, but here's what I do:
I run a VirtualBox instance on my computer that looks as close as possible to what my deployed server looks like, at least as close as I can get the configuration files to be. I would argue that this approach is less error-prone than an additional VirtualHost entry.
Depending on the project, I can get this down to being identical minus perhaps some DNS entries, so I can set everything up to point to either testing.myproject or production.myproject, and this I always automate (I use chef, but that is overkill for a smaller project) so that it's testable code and not prone to finger-fumbling. There's nothing worse than running smoke tests that wipe your database - and having the config accidentally pointing at your prod db.
Running a virtual machine also lets you painlessly test upgrades to the environment or OS of your server, and you can nuke and restore to a snapshot if you want to go back to an earlier state of the machine's configuration.
If you really want to prevent SSH developer access to your prod machines - and IMO that's a bad idea, because if you have problems on your production server, you've prevented your developers from diagnosing or fixing them - then I think your best bet is to use something like Hudson, which is a continuous integration framework. You only give SSH access to the Hudson user to run your deploy script, but anyone (with the right privileges set in Hudson) can run that job. In fact, this is handy in an environment where you have, e.g., some product management members you want to be able to update the production server without being able to log in. The "poor man's" version of this is using sudo to allow your devs to run a command as another user who does have SSH access - and only allowing them to run the publish script.
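The sudoers side of that "poor man's" setup is a one-liner; something along these lines, where the group name, the deploy account and the script path are all examples you would adapt:

# /etc/sudoers.d/deploy: members of "developers" may run only the publish script, as the "deploy" user
%developers ALL = (deploy) NOPASSWD: /usr/local/bin/publish-site.sh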
I would still recommend giving your devs access to your machine, though you don't have to hand over the keys to the kingdom. Just create a "developers" group, assign your devs to it, and give it enough permissions to play with the necessary directories of the server, and you should be good to go.
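A minimal sketch of that group setup, with the group name, user name and path as examples only:

groupadd developers
usermod -a -G developers alice           # repeat for each developer
chgrp -R developers /var/www/myproject
chmod -R g+rwX /var/www/myproject        # group read/write; execute only on directories and files already executable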

How to allow self signed cert in Mercurial through server configuration?

I've read similar questions here and elsewhere. This is not intended to be a duplicate, but I haven't found the answer.
I'm trying to ask a very particular question, so please don't mark this as duplicate unless you can point me somewhere with a very specific correct answer.
I'm running CentOS 6 and I have Mercurial 1.9 installed as our Mercurial Server.
I can add repositories, and I can clone, commit changes, and push back to the server with no problems as long as I don't try to use SSL.
The Apache website is configured with a self-signed SSL cert (I am aware of the pros and cons of self-signed SSL certs, but we have made the decision to use one unless it is technically impossible).
Our client machines are Windows 7 with TortoiseHg 2.1.4 installed. In Visual Studio 2010 I'm using "Mercurial Source Control Package".
What I would like to do, is make a server configuration change that would either on a server level or repository level allow a self signed certificate.
Per-client-machine changes are burdensome because even after I update everyone's machine, the next time I set up a new client I have to have these changes documented and remember to go back through the steps.
I've tried the hostfingerprints option but I haven't been able to get it to work. I'm not sure if this is supposed to work as a server configuration or if I'm putting the setting in the correct file or what.
As a side note, I finally found how to turn on --insecure through the TortoiseHg UI (clicking the lock icon), but it looks like the Visual Studio source control provider doesn't have that option (at least not that I can find).
I'm not a Linux expert (but I have access to experts if needed) so please be verbose in your explanations.
Everyone in our organization is a Mercurial novice.
As a last resort, we may just get an SSL cert.
Jamie F is correct, but I'll put it down here since s/he didn't. There is nothing a server can do to tell a client to trust it -- there would be little point in that. You need to either configure your clients or use a certificate signed by a CA that your client systems already trust.
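What you can do is make that client configuration a one-time rollout (for example with the same imaging or GPO process you use to install TortoiseHg; that part is an assumption about your environment). In each client's mercurial.ini, something like this sketch, where the host name and fingerprint are placeholders:

[hostfingerprints]
# SHA-1 fingerprint of the server's self-signed certificate (placeholder value);
# one way to read the real one:
#   openssl s_client -connect hg.example.com:443 < /dev/null | openssl x509 -noout -fingerprint -sha1
hg.example.com = de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef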

Using Mercurial (hg) to keep track of changes and synchronize automatically?

First off, I've been staring at page after page of solutions but none of them seem to fit the situation I have.
I have web developers all around the country using Windows workstations with Eclipse. We decided a DVCS was best for us because the centralized system just isn't working (Serena: slow network connections mean check-ins take forever... people don't do it because it's not "streamlined", etc.)
We use Eclipse to edit and modify files on a development server in a different state. (Most DVCS scenarios assume you have a web server setup on your workstation or are doing binary executable development.)
What I'd like to try is to have a local repository for developer changes and "feature play" but automatically keep the development repository up to date. I thought of using Mercurial hooks to automatically pull/update/merge/push, but that requires the developer to commit every time they want to test a change (in order to fire the hook that uploads their file to the development server). It would be ideal to have this happen automatically on file saves, because training people to use version control is already an issue (mainly because it's painfully slow at the moment over the WAN and to the virtual locations; getting the WAN upgraded is not an option).
My guess is that I'm going to have to setup Unison or something to keep the developer's repository synced to the development server as if it were a local copy and that of course would sync with the other developers. I was trying to find out if anyone had a solution that's streamlined/simple for keeping all developers up to date while allowing them to version control at will (easily.)
This is what branches are for. Have a branch called "unstable" and set your development server up with a repository that auto-updates on commits/merges to only that branch (via hook). Individual developers are free to work on feature branches and commit locally. When something is ready to be shared, the developer merges their changes into the "unstable" branch and pushes that branch to the development server/repository.
I manage my deployments this way. My Web server virtual host web root points to a Mercurial repository/working copy. When something is ready to go out, I merge it into the "stable" branch and push "stable" to the server. The repository hook updates the files on the virtual host with the new changes.
[hooks]
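# after every push into this repository, update the working copy to the tip
# of the "stable" branch (>&2 keeps the hook's output out of the protocol stream)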
changegroup = /usr/bin/hg update stable >&2
I've only been doing this for a month or so but it's been working like a charm.
Also, +1 on checking out Hudson/Jenkins. What you're looking for is called "continuous integration" (CI) and Hudson and Jenkins make that happen.
If you can't have a development server on all workstations, then maybe using Test Driven Development with some sort of Unit Test framework for your language would work - most of those don't require a full server implementation to be able to write and test your code. It would require a change in paradigm but you'd sure get better quality of code out of that.
We're very happy with Mercurial, but we had to change our habits... and it took some time.
Each dev now has a local test platform. Commits are made on branches, and nothing is pushed before the tests have been validated locally.
Then, Hudson is our friend. To integrate each team's work, every commit results in a build that tests the integrity of the app. A red build means rollback and the changes go back to the dev; green means all is well.
Devs are there to commit 'sane' code and integrate it, team by team, into the central repo. They must decide when it's time to push; no automated sync task can relieve them of that responsibility. When I see all the errors that can happen even when each push is a conscious human decision, I can't imagine what would happen if changes were pushed automatically on every file save.
Good experience with Mercurial...
You say that you want to have a local repository for developer changes, but automatically push any changes to the server. If you cannot have a local dev environment to test changes on, what is the point of having local development branches? If your testing must be done on the development server, there is no way I can think of to allow for "feature play" in local repositories while maintaining any kind of sanity on your development server.
Your best bet in this scenario would probably be branching on the development server and having the server checkout different branches to test different features (hg update -C feature-blah). The default state for the server's repository should be a checkout of the main "devel" branch (hg update -C devel), and when any features or bugfix branches are verified as working they are merged back into "devel" and the server's repository updated from that.
Edit to clarify: your developers would either checkout from "devel" or from a feature branch to their local machine. They would then make any edits and push it back up to the server, and then switch the server's active branch to the newly-updated code.
Also, I am assuming from your other comments that there is only one dev server and it is only able to run one version of the code at a time. If this is not the case, my answer makes no sense at all.
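In command terms, the cycle on the dev server would look roughly like this (branch names as above; a sketch of the workflow, not a full script):

hg pull                        # receive the developer's pushed feature branch
hg update -C feature-blah      # put the server on the feature branch for testing
# ...test on the server...
hg update -C devel
hg merge feature-blah          # once verified, fold the feature back into devel
hg commit -m "merge feature-blah into devel"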

Mercurial local repository backup

I'm a big fan of backing things up. I keep my important school essays and such in a folder of my Dropbox. I make sure that all of my photos are duplicated to an external drive. I have a home server where I keep important files mirrored across two drives inside the server (like a software RAID 1).
So for my code, I have always used Subversion to back it up. I keep the trunk folder with a stable copy of my application, but then I create a branch named with my username, and inside there is my working copy. I make very few changes between commits to that branch, with the understanding that the code in there is my backup.
Now I'm looking into Mercurial, and I must admit I haven't truly used it yet so I may have this all wrong. But it seems to me that you have a server-side repository, and then you clone it to a working directory in the form of a local repository. Then as you work on something, you make commits to that local repository, and when things are in a state to be shared with others, you hg push to the parent repository on the server.
Between pushes of stable, tested, bug-free code, where is the backup?
After doing some thinking, I've come to the conclusion that it is not meant for backup purposes and it assumes you've handled that on your own. I guess I need to keep my Mercurial local repositories in my dropbox or some other backed-up location, since my in-progress code is not pushed to the server.
Is this pretty much it, or have I missed something? If you use Mercurial, how do you backup your local repositories? If you had turned on your computer this morning and your hard drive went up in flames (or, more likely, the read head went bad, or the OS corrupted itself, ...), what would be lost? If you spent the past week developing a module, writing test cases for it, documenting and commenting it, and then a virus wipes your local repository away, isn't that the only copy?
So then on the flip side, do you create a remote repository for every local repository and push to it all the time?
How do you find a balance? How do you ensure your code is backed up? Where is the line between using Mercurial as backup, and using a local filesystem backup utility to keep your local repositories safe?
It's OK thinking of Subversion as a 'backup', but it's only really doing that by virtue of being on a separate machine, which isn't really intrinsic to Subversion. If your Subversion server were the same machine as your development machine - not uncommon in the Linux world - you wouldn't really be backed up in the sense of having protection from hardware failure, theft, fire, etc. And in fact, there is some data in that case that is not backed up at all - your current code may exist in two places, but everything else in the repository (e.g. the revision history) exists in only one place, on the remote server.
It's exactly the same for Mercurial except that you've taken away the need for a separate server and thus made it so that you have to explicitly think about backing up rather than it being a side-effect of needing to have a server somewhere. You can definitely set up another Mercurial repository somewhere and push your changes to that periodically and consider that your backup. Alternatively, simply backup your local repository in the same way that you'd back up any other important directory. With you having a full copy of the repository locally, including all revision history and other meta data, this is arguably even more convenient and safe than the way you currently do it with Subversion.
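The "push somewhere periodically" option is only a couple of commands; a sketch, with the backup path as an example (run the push from cron, a scheduled task or a hook if you want it automatic):

hg init /mnt/backup/myproject       # one-time: create an empty backup repository
hg push -f /mnt/backup/myproject    # repeat as often as you like; -f lets it through even with multiple local heads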
The "hidden" .hg directory stores all of the local commits. You can back up this directory using a standard backup program.
The changes get to the remote directory only when you push. Commits stay local, but you get them if you clone your repository. Then, yes, if you want your things to get to the server repository you have to push to it "all the time".
On the other hand, nothing stops you from having several machines and pushing content from one to another. Every Mercurial repository can turn itself into a server in a matter of seconds by typing "hg serve".
I'm not sure this really answers your question, but I too am a big fan of backups and manage things this way, with many clones of my repository (I also make heavy use of mq to work in patch mode, but that's another story).
PS: as a side note, I'm considering using Mercurial as a tool for filesystem backup. The only thing that bothers me is that for this purpose I would prefer to disable the diff feature and treat all files as binary, but that should be easy.