Secure Repositories by User in Mercurial - mercurial

I have followed the many helpful ideas presented in this SO question. Now the last thing I'm wrestling with is how to allow certain people to access and view the contents of certain repositories. I want to have a central Repos folder on my machine, where all Hg Repos will live, but I would like to say that Person X can see Repos A, B and C while Person Y can only see A & C. I have not been able to find the answer to this question and I hoped that someone on SO could assist me. I can control the push/pull, but I haven't seen a way to actually prevent repos from being visible in the hgwebdir CGI application.

Use the ACL Extension (distributed with Mercurial).
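A hedged addition, not part of the original answer: the ACL extension applies to incoming changesets (pushes), so on its own it may not hide anything in the hgwebdir listing. hgweb also reads an allow_read list from the [web] section of each repository's hgrc, and hgwebdir should leave out repositories that the authenticated user may not read. This presupposes that the web server authenticates users. The sketch below only renders such fragments. The user names are hypothetical, and the repository names A, B and C come from the question.

```python
# Hypothetical access table from the question: X sees A, B, C; Y sees A and C.
ACCESS = {
    "personX": ["A", "B", "C"],
    "personY": ["A", "C"],
}


def readers_by_repo(access):
    """Invert {user: [repos]} into {repo: sorted list of users}."""
    readers = {}
    for user, repos in access.items():
        for repo in repos:
            readers.setdefault(repo, []).append(user)
    return {repo: sorted(users) for repo, users in readers.items()}


def render_web_section(users):
    """Render the [web] fragment intended for <repo>/.hg/hgrc."""
    return "[web]\nallow_read = " + ", ".join(users) + "\n"


if __name__ == "__main__":
    for repo, users in sorted(readers_by_repo(ACCESS).items()):
        print("# Repos/%s/.hg/hgrc" % repo)
        print(render_web_section(users))
```

The Repos/ path in the printed comment is only a placeholder for the central folder mentioned in the question.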

Related

How to work without access to subrepository in Mercurial

There is a project repository A that uses a framework subrepository B. The internal team has access to both. Is it possible to restrict the external team's access to B without breaking their ability to work with A?
Currently it says "abort: response expected (in subrepo ...)" when I cancel the password prompt during cloning.
Or maybe there is another way for collaboration with different access rules?
Thanks in advance!
Sorry, not possible. Actions on the "parent" repo require at least read access to the "child" repo. It's probably a lot of work at this point, but one thing to consider is making them both "sibling" repositories of an "outer" umbrella repo. That setup would look like:
UMBRELLA
    PROJECT
    FRAMEWORK
Then people with access to both can check out UMBRELLA and have both in known, predictable locations, and can still commit across both from a single location. Those who can access PROJECT but not FRAMEWORK would check out only PROJECT.

Mercurial: Incomplete central repository possible?

I want to realize the following setup:
AtWork:MercurialRepo <-> Internet:MercurialRepo <-> AtHome:MercurialRepo
Problem is the repository is several gigs. I already have the entire repo at home (through bundling->cdrom->unbundling). The thing is, I do not want to store the whole repository on the internet. Is there a way to temporarily exclude folders from versioning in order to push/pull only a subset of the repo I am working on through the internet? How do I best accomplish my goal? From time to time I would need to do the tedious bundling -> cdrom -> unbundling route, just to update everything else, but in general I do want to use the internet route and do not want to store the whole repo there.
So, as you've found out by now you can't selectively clone some files from a repository. The best you can do is clone a subset of all branches; but you will get the entire past history of these branches, for all files in the repository. So, unless a lot of the big files are only known in some branches and not others, this won't help you.
Since your problem is the large size of files (rather than a long and bulky history), you probably need to break it down into several "subrepositories" of manageable size. Note that the subset you are interested in cloning must be a subrepository; cloning the main repo necessarily includes the subrepositories. The mercurial subrepository documentation recommends that you make a trivial ("thin shell") main repo, and put all your project code in subrepositories.
Subrepositories are a complex solution, and are considered a "feature of last resort" by the mercurial team. It's a complex setup, there are various limitations (see the docs), and you'll have the extra complication of trying to convert your repo in a way that will preserve file history. So, it's worth considering ways to avoid this:
a) It would be best if you can avoid the middle copy of your repo; is there no way you can set up ssh access or a proxy so that your home repo can talk to your work repo directly? (Or vice versa; it's enough if one of the locations is able to contact the other).
b) You could carry the repo on a USB stick, as @vaclav's answer suggests.
c) Or maybe you should just bite the bullet and clone the entire repo on the internet.
Is there a way to temporarily exclude folders from versioning in order to push/pull only a subset of the repo I am working on through the internet?
Not folders, but some parts of the repo: yes.
You can push -b (only some branches) or push -r (a revision with its ancestors; for the latest work that would be -r tip), but the final size of the transfer depends heavily on the shape of your DAG. With a lot of cross-branch merges you will probably skip only a small part of the changesets.
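As a rough sketch of the two options above, the helper below builds an hg push command line limited to given branches or revisions. The function names and the example destination are made up for illustration; only the -b and -r options come from the answer.

```python
import subprocess


def push_command(dest, branches=(), revs=()):
    """Build an 'hg push' argument list limited to given branches/revisions."""
    cmd = ["hg", "push"]
    for branch in branches:
        cmd += ["-b", branch]
    for rev in revs:
        cmd += ["-r", rev]
    cmd.append(dest)
    return cmd


def push(repo_dir, dest, branches=(), revs=()):
    """Run the push inside repo_dir and return hg's exit code.

    The exit code is returned rather than raised on, because hg also
    exits with a non-zero code when there is simply nothing to push.
    """
    return subprocess.run(push_command(dest, branches, revs), cwd=repo_dir).returncode
```

For example, push_command("https://example.invalid/repo", revs=["tip"]) yields the arguments for pushing only tip and its ancestors.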
I have a small idea, a bit different from what you asked, but...
If I had the same issue, I would think of using a USB flash drive to hold the whole repository (if it is about 10 or 20 GB, that should be cheap). At work you can copy or clone the whole repo to the USB drive, pull new changes from it at home, and, when your work at home is done, push to the repo on the flash drive and then pull from it into the repo at work. (I even use temporary commits for unfinished work, which I then revert into the working directory and strip, so I can continue where I left off.)
But definitely the easiest way is to try to get some connection to the work servers or to your machine at work, or to get more space for the repo on the internet. So, just another idea. HTH
This is not really possible. The closest thing would be to use sub-repositories, which will effectively allow you to have only part of your big repo on the net.

Mercurial Repository Architecture for Code Reviews

We are in the process of moving to Mercurial from Clearcase (for version control) and to Jira/Crucible from ClearQuest (for issue tracking and code reviews). We perform mandatory pre-push reviews.
We have encountered a problem with Crucible and pre-push support, and we are looking for solutions. The main way to resolve the problem is to make the Atlassian products "watch" as few repositories as possible (the issue we encountered is slowness that is directly linked to the number of repositories watched).
What we do now is watch every single development repository to allow us to perform code reviews on them. We also have one central repository that holds a stable version. My question is how to plan our repository architecture so we can perform code reviews and still keep a clean central repository (I guess some sort of review repository is needed, but I can't figure out how to get it to work for several reviews at once).
We do pre-push reviews the easy way: we use patches instead of having development repositories on a central server.
Only if we need to build something big, we create a development/feature repository on the server, but even then, we still review patches before pushing to those repos.
To enforce this, you need to assign roles for pushing to the repos, instead of allowing the whole development team to push.

How do large companies deal with Mercurial?

I am investigating how to migrate our source control from SVN to Mercurial. One thing I am not sure how to deal with is usernames in commits. From what I've seen, there is no way to force an HG user to use a specific username: even if one is specified in Mercurial.ini, the user can override it in commits with the -u flag of hg commit.
How do companies deal with this? There is nothing to prevent developer A from committing something in his repository as developer B and then pushing it to someone else.
Thanks.
I wouldn't say our company is large (4 developers), but it's never been an issue for us so far. I haven't seen any way to prevent that behavior either in my searching. I guess it comes down to an issue of trust amongst your developers.
Unrelated, we did successfully migrate from SVN to Mercurial about two years ago so I may be able to answer other questions you have.
EDIT: An idea:
I'm not sure how you were planning on setting up your topology, but we have a server that functions as the central repository for all our repos. It is possible to push changes between developers (bypassing the central server), but we never do that. We always commit locally and then push/pull from/to the central server. Additionally, we use https and windows authentication to authenticate with this central server.
If you're planning on having something like this, you could create a hook on the server (see repository events; probably pretxnchangegroup, which runs for incoming pushes, rather than precommit, which only runs for local commits) that would verify that the user name in each commit being pushed is the same as the authenticated user from the web server.
Not sure if this would work, but it sounds plausible.
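A minimal sketch of the comparison such a hook would perform, under stated assumptions: the login is taken to be the local part of the e-mail address in the changeset's user field (or the whole field when there is no address), and the function names are hypothetical. Wiring this into a real server-side hook, which means reading the incoming changesets and obtaining the authenticated user from the web server, is left out because it depends on the server setup.

```python
import re


def login_of(user_field):
    """Extract a login from a changeset user field.

    'Alice A <alice@example.com>' gives 'alice'; a bare 'alice' stays 'alice'.
    Mapping the e-mail local part to the login is an assumption.
    """
    match = re.search(r"<([^@>]+)@[^>]*>", user_field)
    login = match.group(1) if match else user_field
    return login.strip().lower()


def foreign_changesets(changesets, authenticated_user):
    """Return the (node, user) pairs whose author is not the pusher."""
    expected = authenticated_user.strip().lower()
    return [(node, user) for node, user in changesets if login_of(user) != expected]
```

A hook built on this would reject the push whenever foreign_changesets returns a non-empty list. Note that such a rule would also reject a developer who pushes changesets legitimately pulled from a colleague, so it only fits the topology described above, where everyone exchanges work through the central server.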
Another attempt(s)
Path-based ACLs in pseudo-CVCS workflow
If you use a "controlled anarchy" workflow (peer-to-peer exchanges are neither controlled, restricted, nor trusted, and a single authoritative repository is the common push target), you can use a "Branch Per Developer" paradigm. That is, with the ACL extension on the central repo, the following restrictions apply:
Nobody can push to the default branch
Each developer can push only to his personal branch (under any name; the name itself means nothing, the branch name is the authentication data used for tracking)
Only trusted mergers can work with the central repo (merging dev branches into default, with NO rebase and NO history rewriting in dev branches)
Each merge changeset in the default branch carries a piece of authentication: its source branch
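The restrictions above can be modelled as a small decision function. This is only a sketch of the policy, not the ACL extension's own logic: the owners mapping, the mergers set and all names are hypothetical. In the ACL extension itself, rules of this kind would presumably go into its branch-based sections ([acl.allow.branches] and [acl.deny.branches]).

```python
DEFAULT_BRANCH = "default"


def may_push(user, branch, owners, mergers):
    """Decide whether `user` may push changesets on `branch`.

    owners  -- hypothetical mapping of personal branch name -> developer
    mergers -- set of trusted users allowed to push merges to default
    """
    if branch == DEFAULT_BRANCH:
        # Ordinary developers never push to default; only trusted mergers do.
        return user in mergers
    # Any other branch is personal: only its owner may push to it.
    return owners.get(branch) == user
```

With owners = {"dev-alice": "alice"} and mergers = {"carol"}, alice may push to dev-alice but not to default, while carol may push merges to default but not to alice's branch.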
Signing branches
If you can't trust the username in commits (and you must not), you can trust strong crypto. Mercurial has at least two extensions that allow commits to be digitally signed, thus providing accurate (so-so, see the notes below) information about authorship; each has its own advantages and disadvantages.
The Commitsigs Extension wiki page and the "Signing Mercurial Changesets on Windows" mini-HowTo are complete enough to understand and demonstrate everything needed to get started. Pro: no additional commits for signing; you can't (by design) sign old historic commits. Contra: not-so-nice output of the needed commands (see the screenshots in Damian's post for log and verifysigs); and because it's GnuPG (no PKI), it is theoretically possible to create and use a key pair for any name and e-mail, and only an "extra" comparison will show two different keys for one user
GPG extension and Approval Reports from wiki as quick-start. Pro: can use pgp-keys or openssl-certs (TBT!!!) (where openssl means one corporate source of issued certs), more readable and informative output of sigcheck command. Contra:
signatures are recorded by committing changes to a .hgsigs file in the root of the working copy
and so it requires extra changesets to be made. This makes it
infeasible to sign all changesets. The .hgsigs file must also be
merged like any other file when branches are merged.
Finally, the file can be modified by hand by a malicious user, like any other file in the working copy
Edit and bugfix
OpenSSL can be used with Commitsigs, not with the GPG extension

Mercurial sub-repositories

I read the tutorial many times and I feel that I am still missing something.
I'll just try to give a concrete scenario. Please help me find where I'm
wrong.
Suppose I have a repository which everyone considers as "central". This
means that every new developer clones from it and pull/push from/to it.
Central contains three folders:
Infra (which is about to be a shared code)
    .hg
    infra.txt
dev1
    dev1.txt
    .hgsub (in which there's a line --> infra = (path of infra) )
    infra (subrepo)
        .hg
        infra.txt
dev2
    dev2.txt
    .hgsub (the same as in dev 1 - infra = (path to infra) )
    infra (subrepo)
        .hg
        infra.txt
Now, suppose that one developer clones dev1, and another one clones dev2.
What I see is that when the developer of dev1 changes infra and pushes the
changes to the repository in central, the only way for the dev2 developer to
learn about the change in infra is to manually check for incoming change-sets
in infra as a sub-repository. In general, this means that if my project has
many sub-repositories (which may themselves contain more sub-repositories),
I have no way to learn about the changes except by going through my
sub-repositories manually.
I think that's not the way to work...
Can anyone help?
Thanks in advance,
Eyal
I think I have found something better.
You can use the --subrepos flag when checking for incoming change-sets in a repository (hg incoming --subrepos).
This searches for incoming change-sets recursively and shows the sub-repositories in which change-sets can be pulled.
This way, one can see which sub-repositories have changed and decide whether she wants to get up-to-date files in those sub-repositories.
You are going to have to pull for each repository. You might think this tedious but there's no way mercurial is going to make the decision to pull changes into your repository for you - this is a good thing.
What you can do is create a simple batch script that runs a 'hg pull' command against each repository. That at least automates the process so it feels less tedious when you really want to pull from all repos.
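A minimal Python sketch of such a script (the answer suggests a batch file; Python is used here only for portability). It treats every directory containing a .hg folder as a repository, nested sub-repositories included, and runs hg pull in each one. The function names are made up for illustration.

```python
import os
import subprocess


def find_repos(root):
    """Return every directory under root (root included) that contains .hg."""
    repos = []
    for dirpath, dirnames, _files in os.walk(root):
        if ".hg" in dirnames:
            repos.append(dirpath)
            dirnames.remove(".hg")  # no need to descend into repository metadata
    return sorted(repos)


def pull_all(root):
    """Run 'hg pull' in each repository; return {repo: exit code}."""
    results = {}
    for repo in find_repos(root):
        results[repo] = subprocess.run(["hg", "pull"], cwd=repo).returncode
    return results


if __name__ == "__main__":
    for repo, code in pull_all(os.getcwd()).items():
        print("%s: %s" % (repo, "ok" if code == 0 else "exit code %d" % code))
```

Plain hg pull only fetches changesets. Updating the working copies remains a separate, deliberate step, which matches the point above about Mercurial not making that decision for you.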
We moved all our subrepos into one repository, which makes it much simpler to manage a change or new feature that requires alterations to all our libraries.
I like subrepos, but I think they are best suited for pulling in entire repositories that others look after and that remain pretty stable. When there are a lot of changes, you need a lot of discipline and a certain amount of scripting to keep manual work down to a minimum.