At work we're moving from no SCM to Mercurial. It's a bit of a learning curve, but after messing with it for two days I definitely feel more comfortable with it.
I still have one big, unresolved question, though: once the code is finished, how do we handle the actual deployment?
Should we be running a copy of Mercurial on the production (live) server? Or should we set rsync or something up to sync from the repo to the web directory? What's the best practice here?
If we do go with just pointing Apache at the repo, I assume that's okay as long as we're careful not to hg update to a different, non-stable branch? That still seems a little dangerous to me, though. Is there some way to force it to only switch to certain revisions?
Or is pointing Apache at the repo just a terrible idea, and should I be doing something else instead?
On a related topic, I've also heard some talk about putting any upgrade scripts (such as schema changes for MySQL) under version control so they can be run when the version is deployed. But how would that even work as part of the workflow? I wouldn't want to keep them with everything else, because they're temporary, one-time-use scripts...
Thanks for any advice you guys can give.
I recently discovered the hg archive command, so I think we'll go with this instead. I've written a bash script that updates to the head of the 'production' branch and then archives it to a predetermined destination. Seems to work.
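Roughly, the script boils down to this (the clone location and destination directory below are just placeholders, not the real paths):

    #!/bin/bash
    # Update to the head of the 'production' branch, then export a clean
    # snapshot (no .hg metadata) to the deploy location.
    set -e
    cd /path/to/deploy-clone                      # placeholder: clone used for deployment
    hg pull                                       # fetch the latest changesets
    hg update production                          # move to the head of the production branch
    hg archive "/var/www/releases/$(hg id -i)"    # placeholder: one directory per revision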
I'd still appreciate any feedback you guys have as to whether this is a good idea or not.
I think pointing Apache at the repo is definitely a bad idea. hg archive is fine if all you want is to take a snapshot of the dev files.
I find my development source files and a deployed application (even for a web app that doesn't need compiling) are usually very different, the latter being derived from a subset of the former.
I tend to use a shell script or even a Makefile to "build" a deployed application in a subdirectory of the development directory. This could be as simple as creating a directory tree and copying the necessary files, or it could include compressing scripts, etc.
This way you have to make a conscious decision whether or not to include a file in the deployed version, thus helping prevent accidentally leaving development utility files in an online application that could cause a security risk.
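As a rough sketch only (the directory names are made up), the "build" step can be as simple as:

    #!/bin/sh
    # Build a deployable copy of the app in a deploy/ subdirectory,
    # copying only the files that belong in the online application.
    set -e
    rm -rf deploy
    mkdir deploy
    cp -R public_html lib deploy/    # only what the app actually needs
    # deliberately NOT copied: tests/, docs/, dev utilities, .hg/
    # optionally compress/minify scripts here before uploading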
The only part Mercurial plays is this: for a major release I create a new named branch (e.g. 1.5), and development continues on the default branch. Subsequent bug fixes or patches can be transplanted to the release branch if necessary, and if a bug-fix release is made I tag the release branch with the new version (e.g. 1.5.1).
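In command terms that amounts to roughly the following (the version numbers are just the examples above; transplant needs the transplant extension enabled, and newer Mercurial offers hg graft instead):

    # open a release branch for 1.5; development continues on default
    hg branch 1.5
    hg commit -m "Start the 1.5 release branch"

    # later, copy a bug fix from default onto the release branch
    hg update 1.5
    hg transplant <revision-of-the-fix>

    # tag the bug-fix release
    hg tag 1.5.1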
Related
I am an applied mathematician and I have recently joined a project that involves the development of production code for our scientific application. The code base is not small and it's deployed as part of a web application.
When I joined, the code was miraculously being maintained without a revision control system. There was a central folder on a server, and researchers would copy from it when they needed to work with the code. Inside this root directory there was a set of directories with different versions of the code, so people would start working from the latest version they found and create a new directory with their modifications.
I created a Mercurial repository, added all the code versions to it, and convinced everyone to use it. However, since moving to Mercurial, we have felt little if any need to upgrade version numbers, even though using hg copy allows us to keep revision history.
Here's where I need your advice on best practices of maintaining this code base. Does it make sense under a RCS to keep folders with different versions in a repo? If we keep a single copy of our code in the repo, what's the most common way to track versions? The README files? Should we keep snapshots of the code outside the repo specifying versions? Does it make sense to keep things as they are? What strategies do you use?
Our team is a bunch of scientists and no one has experience on how to maintain such a repo, so I'm interested in what is commonly done.
If you are going to use a version control system, forget about those version folders. Completely. Mercurial will do that for you; the repository is a complete history of all the files in the project.
A common way to track version numbers is with tags. You assign a tag with the version number to a changeset.
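A minimal sketch of that (the version number is just an example):

    # mark the current changeset as release 2.0
    hg tag 2.0

    # later, get exactly that release back into the working copy
    hg update -r 2.0

    # list all the tags in the repository
    hg tags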
To help you get started with version control, I suggest this book: Version Control By Example. It's free, and it starts from the beginning: it covers CVCS, DVCS, fundamentals, what a repository is, basic commands, and so on. It also has some interesting analogies, like the 3D file system: Directories x Files x Time. The book is fun and easy to understand; I highly recommend it.
I also recommend a GUI tool like TortoiseHg. In daily usage I spend most of my time in the console, but the GUI is very handy, especially in the beginning when you don't yet know all the commands. And the best part is the graph: you get visual feedback of what is going on.
This is a good and quick introduction to Mercurial, it even starts out by talking about how using folders to keep different versions is not so great.
I think you're probably on the wrong track if you are using the hg copy command; I've never needed it ;)
The tutorial teaches the command line version of hg, which I personally prefer. When you need a better overview of your repository, you can run "hg serve" and open localhost:8000 in your web browser. I prefer that over TortoiseHG, but I realize that many people want a pure GUI tool.
I use Mercurial in a single-user workflow to have the option to roll back changes if my coding or writing goes horribly wrong (I primarily use the Stata and R statistics packages and LaTeX). While working only locally, this has been easy since all I have is the main repo.
Recently I have started ssh-ing into a Linux server for more computational power. So far I have been manually copying files back and forth and using Mercurial only locally, but I would like to use Mercurial to take care of this and keep these two workflows synchronized. Also, I like the ability to code both locally (on my laptop or desktop) and on the server.
Do I need to work on a clone of the main repo on the server and keep the main repo untouched? Or can I work directly in the main repo when I am on the server? In this question, @gizmo points to this workflow guide; the "single developer" discussion is helpful, but it's still not clear to me whether I can work in the main repo while I'm on the server without causing some major problem that I don't yet understand.
Thanks!
Edit: I should add that I have worked through Joel Spolsky's HgInit.com tutorial and I'm comfortable pushing/pulling/cloning/etc over ssh, but I am still not sure if I can work in the main repo without causing heartache later. Or maybe this is more a philosophical question? Thanks!
Mercurial is a DVCS, which means that in each location you have both a local working copy and a local repository.
It also means you can freely exchange (pull/push) data between repositories, as long as they provide some remote-access method.
If you are
comfortable pushing/pulling/cloning/etc over ssh
and you remember to pull and push around your work at home (so that you don't have to run hg serve on your home host and sync from the server side), you won't get any headaches at all, and you'll have a clean, linear aggregated history in each place. Even if you forget to sync the repos sometimes, the worst case is two heads later on, which you'll be able to merge easily (I don't know the formats of Stata and R data files, but LaTeX, being text, is mergeable).
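For example, the daily cycle could look like this (the host name and path are placeholders; if the laptop clone was made from the server repository, its default path already points there and the URLs can be dropped):

    # on the laptop, before starting work: get anything new from the server
    hg pull ssh://user@server/path/to/repo
    hg update

    # ...edit, then record the work locally...
    hg commit -m "describe the change"

    # before switching to the server: send the new changesets back
    hg push ssh://user@server/path/to/repo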
There is no problem with working directly in the repository on your server. From Mercurial's point of view, the "main" repository is just another random repository — Mercurial doesn't consider it to be special.
You don't say this directly, but one thing that people ask is "What happens when I push to the server?" The answer is that hg push only sends data into the repository (the .hg/ folder). The working copy is not touched on the server when you push to it. Since you push new changesets to the server, you might need to run hg update the next time you work on the server. This is just like if you had run hg pull on the server — there you'll also merge or update afterwards.
I have this situation all the time: I create a repository at home and clone it to my computer at work. I change files in either location and push/pull between the two repositories. If I need to share my work with others, then I make a repository at Bitbucket and push the code there. That way Bitbucket serves as a nice canonical repository for the code and I typically change the default path to Bitbucket in the repositories at home and at work. So at home I would have:
    [paths]
    default = https://bitbucket.org/mg/<repo>/
    work = ssh://mg@work/<repo>
so that I can do hg push to send things to Bitbucket and hg pull work to grab things directly from work (in case I forgot to push to Bitbucket before leaving).
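Presumably the repository at work would then have the mirror-image configuration (the "home" host name here is just illustrative):

    [paths]
    default = https://bitbucket.org/mg/<repo>/
    home = ssh://mg@home/<repo>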
I have two ScrewTurn wiki documentation sites that are used for our system and user documentation. My idea is to create a Mercurial repository in each wiki site root directory. Then on a daily basis have a scheduled process add new files and commit changes to the repository and push the changeset to a backup repository.
I realize that, by default, ScrewTurn creates copies of all changed files and therefore has its own change tracking but I am considering turning that behavior off.
I believe this would give me better version control than the default behavior, plus an automated backup.
Are there some considerations that I am missing? Is this a good idea? A bad idea?
I don't know anything about ScrewTurn, but as long as its files are stored as text and you can disable its revision tracking, Mercurial backups are a fine option. You'll of course only have access to revisions that existed at the time of your cron job, but that also means you won't ever lose more than 24 hours of editing work.
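A minimal sketch of such a nightly job (the wiki path and backup URL are assumptions):

    #!/bin/sh
    # Nightly wiki backup: record added/removed files, commit, push to the backup repo.
    cd /path/to/wiki-root || exit 1               # placeholder: ScrewTurn site root (an hg repo)
    hg addremove                                  # track files created or deleted since the last run
    hg commit -m "Nightly snapshot $(date +%F)"   # exits non-zero when nothing changed; harmless
    hg push ssh://backup-host//srv/backups/wiki   # placeholder backup repository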
Incidentally, mpm, Mercurial's primary author, has talked about using DVCS systems as the backend for wiki systems in the past and was generally not in favor of the idea. If I recall correctly, his logic was that using a datastore that acquires a global lock for something that changes only a page at a time doesn't make much sense. However, that would only apply if you were committing after each change; your plan to commit nightly doesn't have that problem.
Alternatively, I'm a big fan of rdiff-backup, which does space-efficient nightly snapshots in a disk-browsable manner.
I'm using Mercurial with TortoiseHg. Each developer has their own repositories, and there's one central repository on the server for synchronizing our changes. (This will sound lame, but we're using it to manage the source for a legacy VB6 project. Nothing we can do about that...)
As has been pointed out elsewhere, there is a big problem in VB6 with merging the .frx (form resources) files. So code changes seem to merge fine, but if two developers both make changes at the same time in the form design view, we can't merge.
I'm ok with disallowing concurrent edits, but of course the whole point of Mercurial is that it's distributed so there is no option to force a file to be locked before editing. I don't believe there's a Mercurial solution for this, so I'm wondering: other developers who are using Mercurial for version control, do you have some 3rd party tool that assists with locking files for editing in the cases where it's necessary? Did we make a mistake using Mercurial instead of something like SVN?
I've heard of some people using a standalone lock server (this one in particular).
This is from Bryan O'Sullivan's book on Mercurial:
There is no single revision control tool that is best in all situations. As an example, Subversion is a good choice for working with frequently edited binary files, due to its centralised nature and support for file locking.
I'm in the process of trying to get my head around a DVCS such as Mercurial. I'm getting quite confused on certain points, though. Firstly, a bit of context:
At the minute I mostly use Subversion, and it works fine for my workflow.
The repository is mostly for my own use; I'm the only web developer, and I only ever submit raw code to my manager, so he never has to see the repository.
I use the repo to create major versions, and as a backup so I can revert when something doesn't work out.
The repo also acts as a file share, enabling me to work from the same codebase at work and at home.
My main reason for wanting to switch to Mercurial is the offline commits and easier branching/merging.
Firstly, can anyone tell me how I would get Mercurial to fit this workflow?
How do i go about sharing multiple repositories (i.e. one for each project) between computers?
Any help would be hugely appreciated,
Thanks
http://hginit.com/
There is a fantastic pre-chapter there specifically for SVN users. The rest of the tutorial will get you on your feet fairly quickly.
I'll answer just one part of your question, that of how to manage access to your repository from both home and work, because this is one of the situations where distributed version control is really useful.
The answer is that your two repositories are clones of one-another (to be correct, one is the clone of the other). You do some work during the day, check it in, then pull that work to your home repository (or push, but that requires more work). The next morning, you do the same thing in reverse. Mercurial comes with a built-in read-only HTTP server that makes it really easy, provided that you can expose a port.
The end result is that you have two repositories (i.e., an automatic backup of the entire history). At any given point in time one is "better" than the other, but since you're the sole committer to both, they won't diverge.
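Concretely, the evening/morning sync might look something like this (the host name is a placeholder):

    # at work, leave the built-in read-only server running on the repo
    hg serve --port 8000

    # at home that evening: pull the day's changesets and update
    hg pull http://work-machine:8000/
    hg update

    # the next morning, do the same in the other direction
    # (run hg serve at home, or push over ssh if that's easier)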