disable/deprecate (but not delete) mercurial repository - mercurial

My colleagues and I use several repositories that live on a centralized (ubuntu, if it matters) server. We include the same 4 repositories as subrepositories in lots of different projects. It is a little irritating to constantly push/pull/merge 4 different subrepositories across lots of different projects, because it takes a while to transfer everything over the network, etc.
I would like to combine these 4 repositories into one master repository that can be included in all of our projects going forward. The challenge is that I do not want to delete the old subrepositories as that would break existing projects that are working just fine.
It would be great if there were a way to designate these old repositories as deprecated or, at the very least, make it impossible for my colleagues to push any new changesets to these repositories and display a helpful error message. Is this possible, perhaps with mercurial hooks as this tangential Q/A suggests?

I wasn't terribly inclined to manipulate permissions with the filesystem because that does not provide a very useful error message to my colleagues. I was able to accomplish this (in mercurial 2.3) with the prechangegroup hook. First, create a file (.hg/deprecated.py, in this case) to store the hook in the shared repository you wish to deprecate:
# .hg/deprecated.py
import sys
import textwrap
# print out a helpful error message in red to make it obvious things
# are not working
msg = "ERROR: Pushing changesets into this repository is no longer supported. "
msg += "This package has been merged into the /path/to/new/repo repository."
print('\033[%im%s\033[0m'%(31, textwrap.fill(msg)))
# return a non-zero exit code to disallow the changeset to be added to the
# target repository
sys.exit(1)
Then tell mercurial to execute this hook before any changeset is added to the repository by adding the following to your .hg/hgrc file:
# .hg/hgrc
[hooks]
prechangegroup.deprecate = python .hg/deprecated.py
This solution simultaneously alerts the coder that the repository is not active, tells the coder where the changes should go instead, and prevents new changesets from being pushed to the deprecated repository. While this isn't as permanent a solution as manipulating filesystem permissions, it does tell people where to find the new repository. Hope someone else finds this useful!
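The same idea can also be written as an in-process Python hook instead of an external script. The following is only a sketch under assumptions: the file name deprecated_hook.py and the function names are hypothetical, and the bytes handling assumes a Mercurial that runs on Python 3 (on the Mercurial 2.3 / Python 2 setup described above, plain strings behave the same way).

```python
# deprecated_hook.py (hypothetical file name)
import textwrap

# Placeholder path taken from the answer above; substitute the real location.
NEW_REPO = "/path/to/new/repo"


def build_message(new_repo=NEW_REPO):
    """Return the wrapped rejection message, coloured red with ANSI codes."""
    msg = (
        "ERROR: Pushing changesets into this repository is no longer "
        "supported. This package has been merged into the %s repository."
        % new_repo
    )
    return "\033[31m%s\033[0m\n" % textwrap.fill(msg)


def reject(ui, repo, **kwargs):
    """prechangegroup hook: warn the pusher and refuse the incoming changesets."""
    # Mercurial on Python 3 expects bytes in ui calls (assumption noted above).
    ui.warn(build_message().encode("utf-8"))
    # For in-process hooks a true return value signals failure, so the
    # changegroup is rejected.
    return True
```

It would be wired up in .hg/hgrc with the python: hook syntax, for example prechangegroup.deprecate = python:/absolute/path/to/deprecated_hook.py:reject (the path is a placeholder).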

If you are using something like mercurial-server, you can just revoke all write access to the deprecated repositories, so that the only permitted operations are pulls or updates.

Related

Synchronizing actual version of Mercurial repository for multiple workplaces

I have three different Linux-based working places, each with a different computer. I need to have a repository synchronized to keep coding on the latest version each time I move from a workplace to another. You can always commit and push to, say, bitbucket and then pull from another computer, but this is not the purpose of a commit.
Other similar posts did not help, like Synchronizing a collection of Mercurial repositories.
Any suggestion?
Your two primary options for exchanging temporary work between repositories are Mercurial Queues and the evolve extension.
Mercurial Queues are documented fairly extensively here. To use them for your purpose, you have to put the patches under version control (explained near the bottom of the chapter) and can then push them to/pull them from a shared patch repository. Note that the book is a few years old and Mercurial has added some convenience features in the meantime. These days you can do operations on the patch repository directly via the --mq option (e.g., hg init --mq, hg commit --mq, hg push --mq) and don't need a bash alias for convenience.
Evolve is probably more intuitive; it provides a fairly straightforward approach to shared mutable history. You can commit changes in one repository, push the changes to a shared repository, pull from another and uncommit or alter them, then push them back.
In order to set this up, you need a shared repository somewhere that is declared as non-publishing. You do this by adding the following lines to its .hg/hgrc:
[phases]
publish = False
This prevents changesets exchanged through this repository from becoming public (at which point, they'd become immutable).
You will also need to install the extension first (unlike MQ, which is part of core Mercurial).
Note that Bitbucket currently does not support obsolescence markers, which are crucial for the functioning of changeset evolution, so you will need to host the shared repository in a different place. Evolve functions not by deleting outdated changesets, but by marking them as obsolete and hiding them (obsolescence markers also track how old and new changesets are related). Because Bitbucket does not support these markers, obsolete changesets will become visible again if pushed there. (Note that you can still use evolve locally or between evolve-aware repositories and use Bitbucket for public stuff.)
Slightly different ways:
Handwork
MQ with MQCollab extension
Commits with "classic" exchange between repos using MultiRepo extension (just don't forget hg pull on every workplace before pull - and add all remote repos into [multirepo] section on each workplace)
Automated way
Create additional "central hub" and use AutoSync extension

Can I work in the repository in a single user Mercurial workflow?

I use Mercurial in a single-user workflow to have the option to roll back changes if my coding or writing goes horribly wrong (I primarily use the Stata and R statistics packages and LaTeX). While working only locally, this has been easy since all I have is the main repo.
Recently I have started ssh-ing into a Linux server for more computational power. So far I have been manually copying files back and forth and using Mercurial only locally, but I would like to use Mercurial to take care of this and keep these two workflows synchronized. Also, I like the ability to code both locally (on my laptop or desktop) and on the server.
Do I need to work on a clone of the main repo on the server and keep the main repo untouched? Or can I work directly in the main repo when I am on the server? In this question @gizmo points to this workflow guide; the "single developer" discussion is helpful, but it's still not clear to me that I can work in the main repo while I'm on the server without causing some major problem that I don't yet understand.
Thanks!
Edit: I should add that I have worked through Joel Spolsky's HgInit.com tutorial and I'm comfortable pushing/pulling/cloning/etc over ssh, but I am still not sure if I can work in the main repo without causing heartache later. Or maybe this is more a philosophical question? Thanks!
Mercurial is a DVCS, which means that each location has both a local working copy and a local repository.
It also means that you can freely exchange (pull/push) data between repositories, provided they offer a remote-access method.
If you are
comfortable pushing/pulling/cloning/etc over ssh
and you remember to do a pull/push cycle around each work session at home (so that you don't have to run hg serve on the home machine and sync from the server side), you will have no headaches at all, and you get a perfectly linear, aggregated history in every location. Even if you sometimes forget to sync, the worst case is two heads, which you can merge easily later (I don't know the formats of Stata and R data files, but LaTeX, being text, is mergeable).
There is no problem with working directly in the repository on your server. From Mercurial's point of view, the "main" repository is just another random repository — Mercurial doesn't consider it to be special.
You don't say this directly, but one thing that people ask is "What happens when I push to the server?" The answer is that hg push only sends data into the repository (the .hg/ folder). The working copy is not touched on the server when you push to it. Since you push new changesets to the server, you might need to run hg update the next time you work on the server. This is just like if you had run hg pull on the server — there you'll also merge or update afterwards.
I have this situation all the time: I create a repository at home and clone it to my computer at work. I change files in either location and push/pull between the two repositories. If I need to share my work with others, then I make a repository at Bitbucket and push the code there. That way Bitbucket serves as a nice canonical repository for the code and I typically change the default path to Bitbucket in the repositories at home and at work. So at home I would have:
[paths]
default = https://bitbucket.org/mg/<repo>/
work = ssh://mg@work/<repo>
so that I can do hg push to send things to Bitbucket and hg pull work to grab things directly from work (in case I forgot to push to Bitbucket before leaving).

Cleaning up a mercurial repository for an external contractor

I have an active project with some sensitive files and directories. I want to hire an external contractor to do some simple UI work. However, I don't want the contractor to have access to some directories and files. My project is in mercurial on Bitbucket.
What is the best way to clean up the project and give him access to commit his changes? I thought about forking into a new repository, but I am worried about removing directories I don't want him to have access to.
How do I remove them so they don't appear in the original changesets? How do I merge his repo back without it removing those directories in my main repository? Is a fork the way to go?
Naturally a repository needs access to its whole history in order to self-check its integrity. I don't know of a way to selectively hide parts of the repository (there's the ACL extension, but it is for write access only).
In your case, I would
1. create a new repository where all sensitive information has been stripped out (use the convert extension for that task).
2. Then I would let the external guy work with that repository.
3. Once his work is finished, pull his repository into a clone of the original one (using -f to force pulling from an unrelated repository), and
4. rebase his first changeset and all its children onto a head of your original repository.
5. Finally, push the rebased head to the original repository.
For steps 3 to 5 you don't necessarily have to wait until the external developer is done. Rebasing intermediate states of his repository is also possible.
Still, this is a theoretical idea; one has to see how it performs in practice.
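For the stripping step, the convert extension reads a filemap whose exclude lines name the paths to drop. The following is a rough sketch of generating that filemap and the matching command line; the helper names and the example paths are hypothetical, and it assumes paths without spaces.

```python
import os


def write_filemap(path, excluded):
    """Write an `hg convert` filemap that drops every path in `excluded`."""
    with open(path, "w") as handle:
        for name in excluded:
            # One directive per line; a directory name drops everything below it.
            handle.write("exclude %s\n" % name)
    return path


def convert_command(source, dest, filemap):
    """Return the argv for the conversion; pass it to subprocess.check_call."""
    return ["hg", "convert", "--filemap", filemap, source, dest]
```

For example, write_filemap("filemap.txt", ["secrets", "config/keys.conf"]) followed by convert_command("original", "stripped", "filemap.txt") corresponds to running hg convert --filemap filemap.txt original stripped. The convert extension ships with Mercurial but has to be enabled first by adding convert = to the [extensions] section of your hgrc.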
Alternative: In case you frequently have external contractors who shouldn't see some parts of your code, I would second @Anton's comment to set up multiple repositories split along permission boundaries.
There are multiple ways to do this:
Using sub-repositories
Using multiple repositories
???
Regardless, you need to restructure and split your existing repository. This will create havoc if you have lots of people working on the project: they will all need to stop working, synchronize their work, destroy their local clones, and clone fresh copies after the restructuring.
One way using multiple repositories would be that you do the following:
Make 2 extra clones of the repository (keep one around for fallback if everything fails, you can always go back)
The first clone you need to run the hg convert command on to get rid of all the bits and pieces your contractor should not access
Then you fix that repository so that it works by itself. You might have to change code to provide hooks and events for anything not present but which you intend to inject into the project before you build
Then you need to run hg convert on the other clone to get rid of everything now present in the first.
Then you pull from the first (contractor) repository into the second (private) repository, merge, and do necessary fix-ups so that the code still works as intended
What you have now is two repositories:
Contractor-repository, with only the bits you want to expose
A private repository, that has pulled and merged from the contractor-repository, and contains all the other bits and pieces
From now on, whenever the contractor has pushed work to his repository, you need to pull from it and into the private repository and then merge.
Your repositories would look like this:
Contractor: ---97---98---99---100---102---103---104

                                              M     M
Private:    ---91---92---93---94---95---96---101---105---106---107
                                            /     /
                                           /     /
            ---97---98---99---100---102---103---104
The two changesets with M above are merge-changesets that merge contractor-supplied code into your private repository.
Note that you too would have to commit code to the contractor-repository, to work on and fix bugs in the code there, but all the private bits you can keep private.

What's the best way to track private files in a public Mercurial repository?

"If it’s not in source control, it doesn’t exist."
This question was addressed for Git here: Techniques to handle a private and public repository?. What about for Mercurial?
I have several public Bitbucket repos (with multiple committers) where I'd like the source to be public, but which load API, SSH keys and other sensitive info from untracked files. However this results in someone emailing around the new config file if we add a new Mailchimp or Hunch or Twilio API key. Is there a way to shield these files from public view somehow and still track them? Everyone is syncing their repo through Bitbucket.
There are two good ways to handle this (besides zerkms's solution, which doesn't offer the ease of synchronization you want, but is what I'd do anyway):
Use Mercurial Queues. When you create a mercurial queue with hg qinit --create-repo it creates an overlay system that can be qpushed atop the existing repo. So you keep your secrets in queues and qpush them when you need them, and qpop them when you don't. With --create-repo the set of overlays (patches) is handled in a repository of its own. So people in the know can push/pull the secret overlay repo and people w/o access to it can use the base repo. The patch repo can be a private repo on bitbucket or hosted elsewhere.
or
Use a subrepo exactly as described in the git solution.
Create filename.ext.sample files with templates inside (probably filled with dummy data), which need to be copied and filled with actual data in the particular working directory.
That is what I usually do ;-)
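The copy step for the .sample templates can be automated. Here is a minimal sketch, assuming the templates sit next to the real files and share their name plus a .sample suffix; the function name is hypothetical.

```python
import os
import shutil


def materialize_samples(root, suffix=".sample"):
    """Copy every `*.sample` template under `root` to its real file name.

    Existing files are left alone so local edits are never overwritten.
    Returns the list of files that were created.
    """
    created = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into Mercurial's own metadata directory.
        if ".hg" in dirnames:
            dirnames.remove(".hg")
        for name in sorted(filenames):
            if not name.endswith(suffix):
                continue
            target = os.path.join(dirpath, name[: -len(suffix)])
            if not os.path.exists(target):
                shutil.copyfile(os.path.join(dirpath, name), target)
                created.append(target)
    return created
```

Running materialize_samples(".") once after cloning creates the real files, which you then fill with actual data. The real file names should be listed in .hgignore so they are never committed.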
Zerkms' solution is fast, easy, and clean, and likely your best bet for preventing secure content from being tracked / published; however as you say, "If it’s not in source control, it doesn’t exist." I find that far more often what I'm trying to keep out of source control is not a security concern, but simply a configuration setting. I believe these should be tracked, and my current employer has a rather clever setup for dealing with this, which I'll attempt to simplify / generalize / summarize here.
REPOSITORY
  code/
    ...
  scripts/
    configparse.sh
    ...
  config/
    common.conf
    env/
      development.conf
      testing.conf
      production.conf
    users/
      dimo414.conf
      mycoworker.conf
      ...
    hosts/
      dimo414-laptop.conf
      dimo414-server.conf
      mycoworker-laptop.conf
      ...
    local.conf*
  makefile
  .conf*

* untracked file
Hopefully the idea here is pretty clear: we define settings at each appropriate level, which gives highly granular control over the codebase's behavior in a logical and consistent fashion.
The scripts/configparse.sh script reads all the necessary configuration files in turn and builds .conf out of all the settings it finds.
config/common.conf is the starting point, and contains logical default values for every setting. Many will likely get overwritten, but something is specified here. It's an error for a setting to be found in another file that isn't first set in common.conf.
config/env/ controls the behavior in different environments, doing things like pointing to the correct database servers.
config/users/ looks for a $USER.conf file, useful for setting things I care about, such as increasing the logging level for aspects my team works on, or customizing behavior I prefer to use across all my machines.
config/hosts does the same for machines, looking for $HOSTNAME.conf. Useful for machine-specific settings like application paths or data directories.
config/local.conf is an untracked file, and lets you set checkout-specific values and/or content you don't want in version control.
The aggregate of all these settings is output to .conf, which is what the rest of the codebase looks for when loading settings.
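The layering described above can be sketched in a few lines of Python. This is not the actual configparse.sh, which is a shell script whose contents are not shown here. The key = value file format, the function names, and the exact override order (common, env, users, hosts, local) are all assumptions made for illustration.

```python
import os


def parse_conf(path):
    """Parse `key = value` lines, skipping blank lines and `#` comments."""
    settings = {}
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            settings[key.strip()] = value.strip()
    return settings


def build_config(config_dir, env, user, host):
    """Merge the layers in order; later layers override earlier ones."""
    merged = parse_conf(os.path.join(config_dir, "common.conf"))
    layers = [
        os.path.join(config_dir, "env", env + ".conf"),
        os.path.join(config_dir, "users", user + ".conf"),
        os.path.join(config_dir, "hosts", host + ".conf"),
        os.path.join(config_dir, "local.conf"),
    ]
    for path in layers:
        if not os.path.exists(path):
            continue  # a layer that does not exist is simply skipped
        for key, value in parse_conf(path).items():
            if key not in merged:
                # Mirrors the rule that every setting must first be
                # defined in common.conf.
                raise KeyError(
                    "%s sets %r, which common.conf does not define" % (path, key)
                )
            merged[key] = value
    return merged
```

A caller would pass os.environ["USER"] and the machine's host name, then write the merged dictionary out to .conf for the rest of the codebase to read.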

How do I configure Mercurial to not commit specific config files?

My team is switching to Mercurial. Our projects all have a config file (web.config or app.config, and a few bat files as well - we are a C# shop). These files need to be part of the repository. When a developer clones the repository, local changes are needed to their config files to get them working. For example, a project's config file may need a connection string to the developer's database, or other environment-specific info. We don't want these changes ending up in the repository. And from time to time we do make changes to these configs that do need to get into the repository and distributed to the team and eventually the customer.
What is the easiest way for us to configure or use Mercurial so that these files are not getting committed by accident? I would like to be forced to make an explicit commit of such files, yet merges from the repo would automatically come down in updates.
This has to be a problem someone else has faced, but as Mercurial newbies we are all at a loss for the best solution.
Edit:
A similar question that may share some common solutions, but is not the same as this question, can be found at: Conditional Mercurial Ignore File
I am including this in case that other question might provide the answer you are looking for.
The typical way to handle this is to store templates for the configuration files in your repository, and add the actual configuration files to the ignore list in Mercurial.
This way, you have pristine, unmodified copies of each configuration file available at all times, even for new developers who clone from scratch. To make a configuration file usable, you make a local copy of the template under the actual configuration file name and modify that copy. You could also use compare/merge programs, such as Beyond Compare, to compare a new version of the template file with your local copy of an older version, to see what changed, and add in the missing bits.
If you need to hard prevent committing the actual configuration files, you need a pre-commit or pre-push hook that does this.
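Such a hook could look roughly like the sketch below, written as an in-process pretxncommit hook. The protected paths, the function name, and the ALLOW_CONFIG_COMMIT override variable are hypothetical; the bytes file names assume a Mercurial running on Python 3.

```python
import os

# Hypothetical list of protected paths, relative to the repository root.
PROTECTED = {b"Projectname/web.config", b"Projectname/app.config"}


def protect_configs(ui, repo, node=None, **kwargs):
    """pretxncommit hook: abort a commit that touches a protected file."""
    # Explicit opt-in for the rare intentional config change
    # (the variable name is an assumption made for this sketch).
    if os.environ.get("ALLOW_CONFIG_COMMIT") == "1":
        return False
    touched = set(repo[node].files())
    blocked = sorted(touched & PROTECTED)
    if blocked:
        ui.warn(b"commit rejected; protected files changed:\n")
        for name in blocked:
            ui.warn(b"  " + name + b"\n")
        # A true return value fails the hook and rolls the commit back.
        return True
    return False
```

It would be enabled per clone in .hg/hgrc with something like pretxncommit.protect = python:/absolute/path/to/protect_hook.py:protect_configs (placeholder path). Since hgrc files are not cloned, every developer has to add this locally.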
In your .hg/hgrc file do this:
[defaults]
commit = -X Projectname/web.config
(assuming "ProjectName" is the project subdir)
Edit:
Also, if you're using Tortoise HG - add this as well:
[tortoisehg]
ciexclude = Projectname/Web.config,Projectname/App_Data/DBFile.mdf
(by the way mind the FORWARD slash in folder-path! Even on Windows!)