Should I make my repository my DocumentRoot for my website? - mercurial

I set up Mercurial on my server, but I am unclear on how things should be organized. I am looking for more examples of different setups, but perhaps I am using the wrong keywords. Right now it is only going to be a handful of developers, and I am unsure whether I should just make the repo the DocumentRoot. I really don't know what questions to ask since this is new to me, but I would appreciate it if anyone could provide some knowledge and guidance. Some questions I do have right now: how should I set up my servers and repositories? Should I set up a separate VirtualHost for a test clone before making it live? Anything would be helpful! Thanks in advance!

There's probably no reason to do this. I would keep them separate but set up an automated process (either a custom script or continuous integration (CI)) that deploys from Mercurial to the site with a single command. Optionally, you can make every commit trigger a deployment.
EDIT: With continuous integration, the CI server is responsible for deploying. If you use SSH, the CI server would pull from hg, export, then upload through SSH. That should address your issues. For a comparison of CI servers that support Mercurial, see this question.
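The pull, export, upload sequence described above could be scripted roughly as follows. This is only a sketch: the function names, the paths and the remote target are made up for illustration, and using rsync over SSH for the upload step is my assumption (any SSH-based copy would do). `hg pull` and `hg archive` are the standard Mercurial commands for fetching changesets and exporting a snapshot without the .hg directory.

```python
import subprocess
from typing import Callable, List


def build_deploy_commands(repo: str, export_dir: str, remote: str,
                          rev: str = "default") -> List[List[str]]:
    """Return the commands for one deployment, in the order they should run.

    repo, export_dir and remote are hypothetical values supplied by the caller;
    use absolute paths, and a fresh export_dir for every run.
    """
    return [
        # Fetch new changesets into the deployment clone.
        ["hg", "--cwd", repo, "pull"],
        # Export a snapshot of the chosen revision (no .hg directory in it).
        ["hg", "--cwd", repo, "archive", "-r", rev, export_dir],
        # Upload over SSH. The trailing slash copies the directory contents.
        # --delete removes remote files that are not in the snapshot, so drop
        # it if the site stores uploads next to the code.
        ["rsync", "-az", "--delete", "-e", "ssh",
         export_dir.rstrip("/") + "/", remote],
    ]


def deploy(repo: str, export_dir: str, remote: str, rev: str = "default",
           run: Callable[[List[str]], object] = subprocess.check_call) -> None:
    """Run each command in turn, stopping at the first failure."""
    for command in build_deploy_commands(repo, export_dir, remote, rev):
        run(command)
```

A CI job (or a developer) would then call something like `deploy("/srv/hg/site", "/tmp/site-export", "deploy@example.com:/var/www/site/")`. Keeping the command list separate from the code that runs it makes it easy to print the commands as a dry run first.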

I don't have a single answer to give you, since many variables and needs affect the workflow, but here are some links to get you started:
http://www.zdnetasia.com/a-development-workflow-for-mercurial-62204755.htm
https://www.mercurial-scm.org/wiki/Workflows
http://www.webdevelopment.nicholastuck.com/tools/one-project-one-repository-mercurial-used-right/
I also recommend reading this excellent Mercurial introduction: http://hginit.com/
You can also find various questions on SO about workflows with Mercurial; have a look at the sidebar on the right, for example.
When you have more specific questions, don't hesitate to ask again!

I would make your DocumentRoot directory a first-level subdirectory of your repository, and here are some reasons why:
If you're using something like Apache to manage your server, you could put other meta-information - like sites-available and sites-enabled configuration files - in a sibling directory, since they're not really a part of the website documents.
Similarly, you can keep a "docs" directory right next to the code.
If your repository root is your DocumentRoot, all other things being equal, you are also serving up your .hg directory, where your whole repository history is, and your .hgignore file, that kind of thing. You can fix this with a .htaccess file, of course, but it's simpler just to have the child folder.
Essentially, codebases tend not to be exactly one-to-one matches with deployed sites, so I tend to favor having the document root be a subdirectory.
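To make the point about serving up .hg concrete, here is a small check you could run against a candidate DocumentRoot. It is just a sketch: the function name and the list of file names are my own choices, and it only looks at the top level of the directory.

```python
import os

# Mercurial bookkeeping that normally lives in the repository root.
REPO_METADATA = (".hg", ".hgignore", ".hgtags")


def exposed_metadata(document_root):
    """Return the Mercurial metadata entries sitting directly in document_root."""
    return [
        name
        for name in REPO_METADATA
        if os.path.lexists(os.path.join(document_root, name))
    ]
```

With a layout like `myproject/.hg`, `myproject/.hgignore` and `myproject/public/index.html`, calling `exposed_metadata("myproject")` reports `.hg` and `.hgignore`, while `exposed_metadata("myproject/public")` returns an empty list, which is what you want from the directory the web server actually serves.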
Deployment is a whole 'nother can of worms. It really depends on your needs as to what you do, but here's what I do:
I run a VirtualBox instance on my computer that looks as close as possible to what my deployed server looks like, at least as close as I can get the configuration files to be. I would argue that this approach is less error-prone than an additional VirtualHost entry. Depending on the project, I can get this down to being identical minus perhaps some DNS entries, so I can set everything up to either point to testing.myproject or production.myproject, and this I always automate (I use chef, but that is overkill for a smaller project) so that it's testable code and not prone to finger-fumbling. There's nothing worse than running smoke tests that wipe your database - and have the config accidentally pointing to your prod db. Running a virtual machine lets you painlessly test upgrades to the environment or OS of your server, and you can nuke and restore to a snapshot if you want to go to an earlier state of the machine's configuration.
If you really want to prevent developers from having SSH access to your prod machines - and IMO, that's a bad idea, because if you have problems on your production server, you've prevented your developers from diagnosing or fixing them - then I think your best bet is to use something like Hudson, which is a continuous integration framework. You only give SSH access to the Hudson user to run your deploy script, but anyone (with the right privileges set in Hudson) can run that job. In fact, this is handy in an environment where, for example, some product management members should be able to update the production server without being able to log in. The "poor man's" version of this is using sudo to allow your devs to run a command as another user who does have SSH access - and only allowing them to run the publish script.
I would still recommend giving your devs access to your machine, though you don't have to hand over the keys to the kingdom. Just create a "developers" group, assign your devs to it, and give it enough permissions to play with the necessary directories of the server, and you should be good to go.

Related

Using versioning on a VM with several users

We are looking for a way to use GitHub on an internal system that we are developing at work. We have developed it in PHP and MySQL, with a fair bit of jQuery/Ajax, on a Windows Server VM running IIS. Other staff can access the frontend over the network using the IP address.
There are currently three people working on it and at the moment we directly edit the file on the VM as we need it to still communicate with the database to check our changes have worked. There is no option to install anything like WAMP on our individual machines and there are the usual group policy restrictions so the only access we have to a database is via the VM. We have been working with copies of files/folders and the database but there is always the risk that then merging these would be a massive task.
I do use GitHub at home (mainly the desktop client, but I can just about get by on the command line as long as I have a list of the commands in front of me) to sync between my PC and laptop via GitHub.com, and I believe that the issues we get with several people needing to update the same file would be eradicated by using it here at work.
However, there are some queries we need to ensure we have straight in our heads before putting forward a request.
Is what we are asking for viable? Can several branches on the same server be worked on at the same time, or would this only work on an individual machine?
Given that our network is fairly restricted, is there any way that we can work on the files on our machine and connect to a VM hosted database? I believe that an IDE will allow us to run php files on a standard machine (although a request for Eclipse is now around 6 weeks old and there is still no confirmation that we will get it any time soon) but will this also allow .
The stuff we do is not overly sensitive but the company would certainly not want what we do out there in a public repository (and also would not be likely to pay for a premium GitHub account) so we would need to branch/pull/merge directly from our machines to the VM.
Does anyone have any advice/suggestions/solutions to this? Although GitHub would be the preferred option as I already use it, we are open to any suggestion that will allow three people, on different machines, to work simultaneously on a central system while ensuring that we do not overwrite or affect each other's work.
Setting up a Git repo on Windows is not trivial and may require a fair bit of work. You could try SVN instead: it is fairly straightforward to install on Windows and has a gentler learning curve than Git. I am not saying SVN is better or worse than Git in general, only that it's better suited to your needs. We have a similar setup and we use Tortoise SVN https://subversion.apache.org/ as a client. SVN also has branches and so on.
SVN for the server-side repository: https://subversion.apache.org/
If you would still prefer Git on Windows, check this out - https://www.linkedin.com/pulse/step-guide-setup-secure-git-remote-repository-windows-nivedan-bamal
1) It is possible to work on many branches and then merge them into a single branch. That's the preferred Git development way. You can do the same on SVN.

Synchronize databases with git deployment

So I own a VPS server running CentOS, and decided to use git for deployment. Man! That's fun. Push, done!
I'm much happier than I was with the old FTP approach.
But I wish I could go further. Today it automagically deploys all my files, but it doesn't even touch my DB, and if I change the schema during my modifications, I have to update the DB manually. So I was thinking about using some git hooks to do this automatically as well.
By now I'm using one git hook at the server, it's a post-receive hook and basically copies files to the production directory when pushed to master.
The prerequisites for the DB deployment are:
It needs to go both ways: if I pull and the DB changes in the repo differ from my local DB, it should update my local DB.
It should be based on modifications and patches, not a dump of the whole DB, so that I can work with the team without compromising other people's work.
I was thinking about keeping a db.sql under version control and writing a script that analyzes it on post-receive (on the server) and post-merge (locally), so it can take the mods and apply them, and I would keep a record in the database of which mods have already been applied (the script should run on both client and server).
Have any of you already done something similar to this? What would you recommend?
Thank you very much in advance.
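For illustration, the "keep a record of which mods were applied" idea could look roughly like the sketch below. It is not a finished tool: it uses Python's built-in sqlite3 as a stand-in for the real database, and the table name `applied_mods`, the function names, and the convention of one numbered .sql file per modification (for example `001_create_users.sql`) are all assumptions made for the example.

```python
import sqlite3
from pathlib import Path


def load_migrations(directory):
    """Read every .sql file in directory into a {file name: SQL text} dict."""
    return {path.name: path.read_text() for path in Path(directory).glob("*.sql")}


def apply_pending(conn, migrations):
    """Apply the migrations not yet recorded, in name order.

    Returns the names that were applied during this call.
    """
    # Bookkeeping table listing the modifications that have already run.
    conn.execute("CREATE TABLE IF NOT EXISTS applied_mods (name TEXT PRIMARY KEY)")
    done = {row[0] for row in conn.execute("SELECT name FROM applied_mods")}

    applied = []
    for name in sorted(migrations):
        if name in done:
            continue  # already applied on this database
        conn.executescript(migrations[name])
        conn.execute("INSERT INTO applied_mods (name) VALUES (?)", (name,))
        conn.commit()
        applied.append(name)
    return applied
```

The same script can be called from post-receive on the server and from post-merge on a local clone, for example as `apply_pending(conn, load_migrations("db/mods"))` with a hypothetical `db/mods` directory in the repository. Because each database remembers what it has already run, every environment ends up applying only the modifications it is missing, which covers the "both ways" requirement without dumping the whole DB.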

Locking down/securing TortoiseHg's web server

I'm migrating a few projects from SVN to Mercurial and I'm not sure how to address this issue: because we are working with MVC 3, we have some SQL connection strings stored in our Web.config file.
Since TortoiseHg automatically starts a wide-open web server when you click "Web Server" in the context menu, I'm looking into ways to restrict it or lock it down, but I haven't had any luck. We obviously don't want anyone to be able to browse or pull, which is enabled by default. While the simplest solution is just not to run it, it is entirely possible that a developer accidentally clicks it while trying to synchronize or clone, clicks X to close it, and then ends up running a local server without realizing it.
How do other developers address this? Am I missing something? I've thought about pushing out a GPO blocking :8000 remote access, but there's nothing stopping a dev. from scrolling up and changing the ports or something silly.
After all clarifications, I still believe you're trying to solve the wrong problem.
hg serve is a legitimate tool that can be used to pull changesets between developers on the same network when it's too early to push those changesets to the server. It may or may not fit into your workflow, but I don't think the problem lies there.
If you expect malice, then nothing prevents any developer from exposing the sensitive information in the Web.config (and, by the way, the source code itself) to a third party, even if you somehow block hg serve.
On the other hand, if you expect carelessness, then you should instruct the developers not to use hg serve, or stop storing any sensitive information in the repository, possibly both.

Hudson slaves, how to access workspace

How do I configure a system with one master and multiple slaves for building normal C code with gmake? How can the slaves access the workspace from the master? I guess an NFS share is the way to go, but if that's not possible, are there any other options?
http://wiki.hudson-ci.org/display/HUDSON/Distributed+builds exists, but I cannot understand from it how workspace sharing is handled.
Rsync? From the master: SCM job -> done -> rsync to all slaves -> build job, and if it was done on a slave -> rsync the workspace back to the master?
Are there any proofs of concept or real-life solutions?
When Hudson runs a build on a slave node, it does a checkout from source control on that node. If you want to copy other files over from the master node, or copy other items back to the master node after a build, you can use the Copy to Slave plugin.
It's surely a late answer, but may help others.
I'm currently using the "Copy Artifact plug-in" with great results.
http://wiki.hudson-ci.org/display/HUDSON/Copy+Artifact+Plugin
(https://stackoverflow.com/a/4135171/2040743)
Just one way of doing things, others exist.
Workspaces are actually not shared when builds are distributed to multiple machines, as they exist as directories on each of those machines. To coordinate items, any item that needs to be distributed from one workspace to another is copied into a central repository via SCP.
This means that sometimes I have a task which needs to wait on the items landing in the central repository. To fix this, I have the task run a shell script which polls the repository via SCP for the presence of the needed items, and it errors out if the items aren't available after five minutes.
The only downside to this is that you need to pass around a parameter (build number) to keep the builds on the same page, preventing one build from picking up a previous version built artifact. That and you have to set up a lot of SSH keys to avoid the need to pass a password in when running the SSH scripts.
Like I said, not the ideal solution, but I find it is more stable than the ssh artifact grabbing code for my particular release of Hudson (and my set of SSH servers).
One more downside: the SSH servers on most Linux machines seem to really lack performance. A solution like mine tends to swamp your SSH server with a lot of connections coming in at about the same time. If you find the same happens to you, you can add timer delays (an easy, imperfect solution) or you can rebuild the SSH server with high-performance patches. One day I hope that the high-performance patches make their way into the SSH server base code, provided that they don't negatively impact the SSH server's security.
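The polling step described above (wait for the needed items, error out after five minutes) can be sketched like this. The answer uses a shell script that polls via SCP; the version below is a Python equivalent that checks with `ssh host test -e path` instead, and the function names, host and path are invented for the example.

```python
import shlex
import subprocess
import time


def remote_file_exists(host, path):
    """Return True if path exists on host, checked by running `test -e` over SSH."""
    return subprocess.call(["ssh", host, "test", "-e", shlex.quote(path)]) == 0


def wait_for(is_ready, timeout=300.0, interval=5.0,
             clock=time.monotonic, sleep=time.sleep):
    """Call is_ready() every `interval` seconds until it returns True.

    Raises TimeoutError if it is still not ready once `timeout` seconds
    (five minutes by default) have passed.
    """
    deadline = clock() + timeout
    while True:
        if is_ready():
            return
        if clock() >= deadline:
            raise TimeoutError("items not available after %s seconds" % timeout)
        sleep(interval)
```

A build step could then run something like `wait_for(lambda: remote_file_exists("repo.example.com", "/artifacts/1234/app.tar.gz"))`, where the build number in the path plays the role of the parameter mentioned above. Raising the polling interval is also the simplest way to spread out connections if the SSH server gets swamped.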

Where do you keep the configuration files for your stack?

For the website(s) I am a developer for we have a number of different technologies which make up our stack, each with a different set of configurations etc.
This is a Rails stack, so we're running things including:
Nginx w/ Passenger
Varnish
Redis
Memcached
MySQL
MongoDB
We're continually tweaking our configs and changing them to support our continually changing system, and if we were to 'lose' the configurations (e.g. due to a server crash or otherwise) it would be a huge pain to rebuild them from memory.
Given that version control would be extremely useful I can quite easily add these files into a Git repo or similar and store them in the cloud somewhere, but what about application-specific configuration (for example, URL Rewrite config for a website on a shared server)? Should these be in this same repo as well?
Put website specific stuff in the Git repo of that website, and system-wide stuff in a "systems" git repo.
If you are not currently using Source Control (of any kind) in your development environment, stop whatever you are doing and sort that out right now. That is the most important aspect of your setup.
At the very minimum, you should keep EVERYTHING that is a text file and relates to your app in source control (yes, all config files and URL rewrites too).
Others suggest you can put binary files also, but at the very minimum all source code, all config etc should be in source control.
By the end of the day :)