How to use mercurial subrepos for shared components and dependencies? - mercurial

We develop .NET Enterprise Software in C#. We are looking to improve our version control system. I have used mercurial before and have been experimenting using it at our company. However, since we develop enterprise products we have a big focus on reusable components or modules. I have been attempting to use mercurial's sub-repos to manage components and dependencies but am having some difficulties. Here are the basic requirements for source control/dependency management:
Reusable components
Shared by source (for debugging)
Have dependencies on 3rd party binaries and other reusable components
Can be developed and commited to source control in the context of a consuming product
Dependencies
Products have dependencies on 3rd party binaries and other reusable components
Dependencies have their own dependencies
Developers should be notified of version conflicts in dependencies
Here is the structure in mercurial that I have been using:
A reusable component:
SHARED1_SLN-+-docs
|
+-libs----NLOG
|
+-misc----KEY
|
+-src-----SHARED1-+-proj1
| +-proj2
|
+-tools---NANT
A second reusable component, consuming the first:
SHARED2_SLN-+-docs
|
+-libs--+-SHARED1-+-proj1
| | +-proj2
| |
| +-NLOG
|
+-misc----KEY
|
+-src-----SHARED2-+-proj3
| +-proj4
|
+-tools---NANT
A product that consumes both components:
PROD_SLN----+-docs
|
+-libs--+-SHARED1-+-proj1
| | +-proj2
| |
| +-SHARED2-+-proj3
| | +-proj4
| |
| +-NLOG
|
+-misc----KEY
|
+-src-----prod----+-proj5
| +-proj6
|
+-tools---NANT
Notes
Repos are in CAPS
All child repos are assumed to be subrepos
3rd party (binary) libs and internal (source) components are all subrepos located in the libs folder
3rd party libs are kept in individual mercurial repos so that consuming projects can reference particular versions of the libs (i.e. an old project may reference NLog v1.0, and a newer project may reference NLog v2.0).
All Visual Studio .csproj files are at the 4th level (proj* folders) allowing for relative references to dependencies (i.e. ../../../libs/NLog/NLog.dll for all Visual Studio projects that reference NLog)
All Visual Studio .sln files are at the 2nd level (src folders) so that they are not included when "sharing" a component into a consuming component or product
Developers are free to organize their source files as they see fit, as long as the sources are children of proj* folder of the consuming Visual Studio project (i.e., there can be n children to the proj* folders, containing various sources/resources)
If Bob is developing SHARED2 component and PROD1 product, it is perfectly legal for him to make changes the SHARED2 source (say sources belonging to proj3) within the PROD1_SLN repository and commit those changes. We don't mind if someone develops a library in the context of a consuming project.
Internally developed components (SHARED1 and SHARED2) are generally included by source in consuming project (in Visual Studio adding a reference to a project rather than browsing to a dll reference). This allows for enhanced debugging (stepping into library code), allows Visual Studio to manage when it needs to rebuild projects (when dependencies are modified), and allows the modification of libraries when required (as described in the above note).
Questions
If Bob is working on PROD1 and Alice is working on SHARED1, how can Bob know when Alice commits changes to SHARED1. Currently with Mercurial, Bob is forced to manually pull and update within each subrepo. If he pushes/pulls to the server from PROD_SLN repo, he never knows about updates to subrepos. This is described at Mercurial wiki. How can Bob be notified of updates to subrepos when he pulls the latest of PROD_SLN from the server? Ideally, he should be notified (preferable during the pull) and then have to manually decide which subrepos he wants to updated.
Assume SHARED1 references NLog v1.0 (commit/rev abc in mercurial) and SHARED2 references Nlog v2.0 (commit/rev xyz in mercurial). If Bob is absorbing these two components in PROD1, he should be be made aware of this discrepancy. While technically Visual Studio/.NET would allow 2 assemblies to reference different versions of dependencies, my structure does not allow this because the path to NLog is fixed for all .NET projects that depend on NLog. How can Bob know that two of his dependencies have version conflicts?
If Bob is setting up the repository structure for PROD1 and wants to include SHARED2, how can he know what dependencies are required for SHARED2? With my structure, he would have to manually clone (or browse on the server) the SHARED2_SLN repo and either look in the libs folder, or peak at the .hgsub file to determine what dependencies he needs to include. Ideally this would be automated. If I include SHARED2 in my product, SHARED1 and NLog are auto-magically included too, notifying me if there is version conflict with some other dependency (see question 2 above).
Bigger Questions
Is mercurial the correct solution?
Is there a better mercurial structure?
Is this a valid use for subrepos (i.e. Mercurial developers marked subrepos as a feature of last resort)?
Does it make sense to use mercurial for dependency management? We could use yet another tool for dependency management (maybe an internal NuGet feed?). While this would work well for 3rd party dependencies, it really would create a hassle for internally developed components (i.e. if they are actively developed, developers would have to constantly update the feed, we would have to serve them internally, and it would not allow components to be modified by a consuming project (Note 8 and Question 2).
Do you have better a solution for Enterprise .NET software projects?
References
I have read several SO questions and found this one to be helpful, but the accepted answer suggests using a dedicated tool for dependencies. While I like the features of such a tool it does not allowed for dependencies to be modified and committed from a consuming project (see Bigger Question 4).

This may not be the answer you were looking for, but we have recent experience of novice Mercurial users using sub-repos, and I've been looking for an opportunity to pass on our experience...
In summary, my advice based on experience is: however appealing Mercurial sub-repos may be, do not use them. Instead, find a way to lay out your directories side-by-side, and to adjust your builds to cope with that.
However appealing it seems to be to tie together revisions in the sub-repo with revisions in the parent repo, it just doesn't work in practice.
During all the preparation for the conversion, we received advice from multiple different sources that sub-repos were fragile and not well-implemented - but we went ahead anyway, as we wanted atomic commits between repo and sub-repo. The advice - or my understanding of it - talked more about the principles rather than the practical consequences.
It was only once we went live with Mercurial and a sub-repo, that I really understood the advice properly. Here (from memory) are examples of the sorts of problems we encountered.
Your users will end up fighting the update and merge process.
Some people will update the parent repo and not the sub-repo
Some people will push from the sub-repo, ang .hgsubstate won't get updated.
You will end up "losing" revisions that were made in the sub-repo, because someone will manage to leave the .hgsubstate in an incorrect state after a merge.
Some users will get into the situation where the .hgsubstate has been updated but the sub-repo hasn't, and then you'll get really cryptic error messages, and will spend many hours trying to work out what's going on.
And if you do tagging and branching for releases, the instructions for how to get this right for both parent and sub-repo will be many dozens of lines long. (And I even had a nice, tame Mercurial expert help me write the instructions!)
All of these things are annoying enough in the hands of expert users - but when you are rolling out Mercurial to novice users, they are a real nightmare, and the source of much wasted time.
So, having put in a lot of time to get a conversion with a sub-repo, several weeks later we then converted the sub-repo to a repo. Because we had large amounts of history in the conversion that referred to the sub-repo, via .hgsubstate, it's left us with something much more complicated.
I only wish I'd really appreciated the practical consequences of all the advice much earlier on, e.g. in Mercurial's Features of Last Resort page:
But I need to have managed subprojects!
Again, don't be so sure. Significant projects like Mozilla that have tons of dependencies do just fine without using subrepos. Most smaller projects will almost certainly be better off without using subrepos.
Edit: Thoughts on shell repos
With the disclaimer I don't have any experience of them...
No, I don't think many of them are. You are still using sub-repos, so all the same user issues apply (unless you can provide a wrapper script for every step, of course, to remove the need for humans to supply the correct options to handle sub-repos.)
Also note that the wiki page you quoted does list some specific issues with shell repos:
overly-strict tracking of relationship between project/ and somelib/
impossible to check or push project/ if somelib/ source repo becomes
unavailable lack of well-defined support for recursive diff, log, and
status recursive nature of commit surprising
Edit 2 - do a trial, involving all your users
The point at which we really started realising we had an issue was once multiple users started making commits, and pulling and pushing - including changes to the sub-repo. For us, it was too late in the day to respond to these issues. If we'd known them sooner, we could have responded much more easily and simply.
So at this point, the best advice I think I can offer is to recommend that you do a trial run of the project layout before the layout is carved in stone.
We left the full-scale trial until too late to make changes, and even then people only made changes in the parent repo, and not the sub-repos - so we still didn't see the full picture until too late.
In other words, whatever layout you consider, create a repository structure in that layout, and get lots of people making edits. Try to put enough real code into the various repos/sub-repos so that people can make real edits, even though they will be throw-way ones.
Possible outcomes:
You might find it all works fine - in which case, you'll have spent some time to gain certainty.
On the other hand, you might identify issues much more quickly than spending time trying to work out what the outcomes would be
And your users will learn a lot too.

Question 1:
This command, when executed in the parent "shell" repo will traverse all subrepos and list changesets on from the default pull location that are not present:
hg incoming --subrepos
The same thing can be accomplished by clicking on the "Incoming" button on the "Synchronize" pane in TortoiseHg if you have the "--subrepos" option checked (on the same pane).
Thanks to the users in the mercurial IRC channel for helping here.
Questions 2 & 3:
First I need to modify my repo structures so that the parent repos are truly "shell" repos as recommended on the hg wiki. I will take this to the extreme and say that the shell should contain no content, only subrepos as children. In summary, rename src to main, move docs into the subrepo under main, and change the prod folder to a subrepo.
SHARED1_SLN:
SHARED1_SLN-+-libs----NLOG
|
+-misc----KEY
|
+-main----SHARED1-+-docs
| +-proj1
| +-proj2
|
+-tools---NANT
SHARED2_SLN:
SHARED2_SLN-+-libs--+-SHARED1-+-docs
| | +-proj1
| | +-proj2
| |
| +-NLOG
|
+-misc----KEY
|
+-main----SHARED2-+-docs
| +-proj3
| +-proj4
|
+-tools---NANT
PROD_SLN:
PROD_SLN----+-libs--+-SHARED1-+-docs
| | +-proj2
| | +-proj2
| |
| +-SHARED2-+-docs
| | +-proj3
| | +-proj4
| |
| +-NLOG
|
+-misc----KEY
|
+-main----PROD----+-docs
| +-proj5
| +-proj6
|
+-tools---NANT
All shared libs and products have there own repo (SHARED1, SHARED2, and PROD).
If you need to work on a shared lib or product independently, there is a shell available (my repos ending with _SLN) that uses hg to manage the revisions of the dependencies. The shell is only for convenience because it contains no content, only subrepos.
When rolling a release of a shared lib or product, the developer should list the all of the dependencies and their hg revs/changesets (or preferably human friendly tags) that were used to create the release. This list should be saved in a file in the repo for the lib or product (SHARED1, SHARED2, or PROD), not the shell. See Note A below for how this could solve Questions 2 & 3.
If I roll a release of a shared lib or product I should put matching tags in the in the projects repo and it's shell for convenience, however, if the shell gets out of whack (a concern expressed from real experience in #Clare 's answer), it really should not matter because the shell itself is dumb and contains no content.
Visual Studio sln files go into the root of the shared lib or product's repo (SHARED1, SHARED2, or PROD), again, not the shell. The result being if I include SHARED1 in PROD, I may end up with some extra solutions that I never open, but it doesn't matter. Furthermore, if I really want to work on SHARED1 and run it's unit tests (while working in PROD_SLN shell), it's really easy, just open the said solution.
Note A:
In regards to point 3 above, if the dependency file use a format similar to .hgsub but with the addition of the rev/changeset/tag, then getting the dependencies could be automated. For example, I want SHARED1 in my new product. Clone SHARED1 to my libs folder and update to the tip or the last release label. Now, I need to look at the dependencies file and a) clone the dependency to the correct location and b) update to the specified rev/changeset/tag. Very feasible to automate this. To take it further, it could even track the rev/changeset/tag and alert the developer of there is dependency conflict between shared libs.
A hole remains if Alice is actively developing SHARED1 while Bob is developing PROD. If Alice updates SHARED1_SLN to use NLog v3.0, Bob may not ever know this. If Alice updates her dependency file to reflect the change then Bob does have the info, he just has to be made aware of the change.
Bigger Questions 1 & 4:
I believe that this is a source control issue and not a something that can be solved with a dependency management tool since they generally work with binaries and only get dependencies (don't allow committing changes back to the dependencies). My dependency problems are not unique to Mercurial. From my experience, all source control tools have the same problem. One solution in SVN would be to just use svn:externals (or svn copies) and recursively have every component include its dependencies, creating a possibly huge tree to build a product. However, this falls apart in Visual Studio where I really only want to include one instance of a shared project and reference it everywhere. As implied by #Clare 's answer and Greg's response to my email to the hg mail list, keep components as flat as possible.
Bigger Questions 2 & 3:
There is a better structure as I have laid out above. I believe we have a strong use case for using subrepos and I do not see a viable alternative. As mentioned in #Clare 's answer, there is a camp that believes dependencies can be managed without subrepos. However, I have yet to see any evidence or actual references to back this statement up.
Bigger Question 5:
Still open to better ideas...

Related

Storing PCB files *and* software files in the same Mercurial repo

I have my Banana Pi set up as my Mercurial server. It works well for me for my software as generally speaking I have firmware and that's about it in my repositories. I can access it via open VPN from anywhere in the world. However, I have started to use version control for my PCB files as well now, due to a new CAD system which complicates my old, crude but effective way of doing my PCB archiving and backup. (Also, everything in my new CAD system, all the PCBs and schamtics, are text files which makes version control work nicely.)
So, with Mercurial I started doing as I did with software and creating a new repo for my PCB for one of the boards I'm updating for a customer, and immediately came across an issue that svn seems to cope with easily and I was wondering whether Mercurial can do the same.
I have my BH0001 project repository which has all the embedded C in it and I have started creating a new issue of the PCB for which the C code is used. I had to create a new Mercurial repo called BH0001_pcb to differentiate between code and PCB. With svn you can have a project repo and then Hardware and Software directories within the project number, but still be able to check out the two different types of files to different places independently.
I could, of course, clone the BH0001 software repository to a local machine, add the PCB info in a new folder in the local Mercurial repo send it all back to the server and it would be perfectly happy. The problem then comes when checking out because I would be cloning both firmware and PCB on to a machine when I might only want one or the other.
Also, this goes against how I store stuff locally. In my /username/home directory I have a Software directory and a CAD directory and within those I have projects. So I would have:
home/CAD/CustomerName/BH0001
and
home/Software/CustomerName/BH0001.
If I'm to carry on using my current method do I have to:
Change my local directory structures to be something like:
home/Projects/CustomerName/BH0001/CAD
and
home/Projects/CustomerName/BH0001/Software
Suck it up and use things like ProjectName_pcb for separate repos.
Some other way I can't think of/can't find/am unaware of? e.g. There's a way of checking out part of a Mercurial repository to one directory and a different part of the repo to a different directory.
Or should I just use svn if I really want to carry on as I have?
With default mercurial you currently cannot do partial repository clones as you can do with SVN. So your approach to use separate repositories is a good choice.
However there ways to achieve a similar result: sub-repositories. In your case I'd create a parent repository which contains your two current repositories as sub-repositories. Mind though, sub-repositories have some rough edges, so read the linked page carefully - I'd like to especially stress that it's good practise to have a parent repo which basically only contains the 'real' repos but not much on its own.
There exist ideas like a thin or narrow clone (which is somewhat identical to what SVN does), but I haven't seen them in production.

Mercurial sparse checkout

In this F8 conference video(starting 8:40) from 2015 they speak about the advantages of using Mercurial and a single repository across facebook.
How does this work in practice? Using Mercurial, can i checkout a subdirectory (live in SVN)? If so, how? Do i need a facebook-mercurial-extension for this
P.S.: I only found answers like this or this from 2010 on SO where i am not sure if the answers still apply with all the efforts FB put into it.
From your question it is not clear if you are looking for a workflow (the monorepo vs multiple repos debate) or for performance and scaling for a huge code base.
For the workflow, I suggest googling for monorepo. It has its pros and cons, you need to understand your situation and current workflow to decide. For the performance and scaling, keep reading.
The idea of remotefilelog is not to checkout a subdirectory (as you mention), the idea is to checkout everything. In order to do that in an efficient way, you need two extensions actively developed by Facebook:
remotefilelog. This gives you something conceptually similar to a shallow clone. This reduces hg clone and hg pull time.
fsmonitor (previously called hgwatchman, it is now part of mercurial core). This dramatically reduces time of local operations such as hg status. Note that fsmonitor is independent from remotefilelog. You can start experimenting with this, since it doesn't require any setup on the server side.
With a recent mercurial (which I strongly suggest) you can shave off the additional startup time of the Python interpreter using CommandServer + CHg.
Some additional notes:
I tested extensively fsmonitor. It works very well, on huge repos the time of hg status is reduced from 10 secs to less than 1 sec (and the majority of this 1 sec is Python startup time, see above for CHg). If your repository is really huge, you might need to fine tune some inotify kernel parameters (or the equivalent on MacOSX). The fsmonitor documentation has all the information you need.
I didn't test remotefilelog, although I read everything I found about it and I am sure it works. Depending on how development is done (everybody has always Internet connectivity or not, the organization has its own master repo or not) there can be a caveat: it partially transforms the decentralized hg into a centralized VCS like svn: some operations that normally can be done offline (for example: hg log and the first hg update to a changeset in the past) will now require connectivity to the master repository.
Before considering remotefilelog, I used extensively the largefiles extension on a huge repo. It has the same drawbacks than remotefilelog and some confusing corner cases for users that want to use hg just to get things done without taking the time to understand how it works. If I were to manage another huge repo, I would use remotefilelog instead than largefiles, although their use case is not really the same.
Mercurial has also support for subrepositories (doc1, doc2). The problem is that it changes the behavior of hg depending on where you are in the source tree. Again, if the developers don't care about really understanding how hg works, it will be just too confusing.
Additional information:
Facebook Engineering blog post
scaling mercurial wiki, although not completely up to date
just by googling mercurial facebook.
i am not sure if the answers still apply with all the efforts FB put
into it
(Early 2017) The answers in the questions linked still apply (because they occasionally get updated) but note that you will have to read all the comments and answers.
remotefilelog essentially allows on demand shallow clones (so you don't fetch the history for everything for all time) but you still fetch the essential metadata for, and checkout across, all the directories of the repo at the desired revision.
Using Mercurial, can i checkout a subdirectory (li[k]e in SVN)? If so, how?
https://stackoverflow.com/a/40355673/7836056 discusses how you might use third party extensions to allow narrow/sparse checkouts (Facebook's sparse.py) or narrow clones (Google's NarrowHG) with Mercurial thus only "creating" a single directory from within the main repository (albeit with radically different tradeoffs).
(Note phrasing matters: "sparse checkout" means a very specific action when referring to distributed version control in a manner that doesn't exist when using it to refer to centralised version control)

TFS 2012 version control vs Mercurial

I'm evaluating a version control system for our team (of 1 to 2 developers) and am wondering how TFS 2012 version control compares to Mercurial in terms of merging and branching. We don't have a large team and it might just be me and maybe another developer so we may not need robust branching/merging capabilities that a DVCS offers. I haven't been able to find much info on TFS 2012 version control in terms of its capabilities. Can anybody point me to a guide or highlight some features? Please note, that we only have access to standalone version of TFS 2012 on VS professional version.
Thanks
Mercurial branching differs from TFS in one critical way. Whereas TFS stores your branches in a different part of the filespace, Mercurial stores them in the history itself. This massively reduces the friction involved in branching and merging. It means, for example that:
Branches are created implicitly, simply by updating to a revision other than the latest, then committing on top of that.
Branches are also closed implicitly, again, simply by merging.
Branches can be re-opened implicitly, the same way as you create them in the first place.
Mercurial lets you switch between branches in-place very easily. You don't need to set up separate workspace mappings, and you don't need to change from one directory to another.
Branches can be anonymous.
Whereas TFS has "baseless merges" (which basically means it has massive problems with branches that aren't in a direct parent/child relationship), there is no such thing in Mercurial. It is always possible to find a common origin between two revisions, no matter how far apart their respective branches have diverged.
The graphical view of your branches in TortoiseHg is integrated with your project history, making it much easier to understand how branching and merging actually works in the first place.
There are several advantages of branching and merging which apply even to small teams or solo developers. One such example is being able to fix bugs on your production system even when you have other features in development. Another example is exploratory or experimental development: if one approach doesn't work out, you can easily roll back to the start and try a second, different approach, while still keeping the original approach around in case you need to refer back to it.
The previous "answers" are a little strange. Branching and merging in TFS 2012 works exactly like you would expect it to, with both a GUI and command line implementation. Here's a good document that covers the topic fairly completely:
ALM Ranger Guide updated
Preface:
I'm not related in any way with TFS and never used it
Face:
Newest 'tfs2012' question, at least some (How to set up TFS to cater multiple projects to multiple clients, TFS 2012: Correllating binaries to builds and source code, TFS 2012 Disable Multiple Check-out not working), demonstrate us problems, which I never consider even as question in Mercurial (with some added external tools sometimes). But you can understand also, that TFS positioned as full ALM-tool, not as pure SCM, which Mercurial is.
"Team Foundation Server is the Lotus Notes of version control tools" from James McKay is not freshest article (Feb 2011), but at least some statement are still valid, I think.
Amount of branching-merging will correlate to team-size only partially, mainly depends on the style of development, habits of developers
If you are in a small team, go with Mercurial without a doubt.
Mercurial is lightweight and flexible with regards to the specific development workflow you will use. It fits smaller teams much better, in fact, it even fits a solo dev because it does not impose any administrative overhead.
About the need for merging, it's really not an advanced feature as you imply: it is a very fundamental feature that most (all?) non-distributed VCS just can't get right. You might not need it now, but if you ever do, you'll have it right there in Mercurial and it just works.

Multiple parallel versions on Mercurial

Suppose I'm developing software for different clients with a common need.
I develop the system with a single line of development, and that is ok. Each client has the software installed on a server, and updates are made by pulling from a bitbucket repository.
Later, one of the clients asks for a custom module for his own needs only. Other clients may also have other custom needs as well.
My problem is: How can I manage the customization, pulling (updating) new versions of the main line of development, and the customs only when they are present?
One option: make the customisation part of the codebase rather than managed through Mercurial. There are lots of good reasons for doing this, including that there's no danger of you forking and ever maintaining separate version of code, or indeed maintaining separate branches of a repository tree. Both of those are pretty dull tasks that could lead to problems, so I'd favour that if I were you. It depends on your language but I would aim for keeping a load of modules in the MainCodeBase with some way for users to configure which are used.
Second option: sub repositories. Make a repo for Customer1 which includes that customer's modules and has a sub-repository of MainCodeBase. Customer1 just needs to recursively pull their repository to get the latest modules and the latest MainCodeBase. Problem here is it starts to get ugly when you want more than one customer to use the same module.

Monolithic vs multiple Mercurial repos for modules within a suite of related applications

At my organization, we'd like to switch from using CVS to Mercurial for a variety of reasons. We've done a lot of investigation when trying to determine how we should organize our Hg repositories based on what we have in our codebase and how we tend to work. We've come up with satisfactory answers to most of our questions, but there's one point that's stumping us a little, and it's driving me mad because the most convenient way to organize the repo for our workflow just seems like the wrong way to go from a conceptual standpoint. I'm trying to figure out whether our perception of how this is "supposed" to work is wrong, or whether we're just bumping up against a legitimate feature gap in the available tooling.
Primarily we maintain a medium sized codebase consisting of a suite of applications that get all get released in the same package. Conceptually you can divide our code into three categories:
Shared code
Application code for our primary suite (uses the shared code)
Miscellaneous small utilities that are infrequently maintained (uses the shared code)
This doesn't seem unusual to me, but I want to stress the point that we maintain the application code and the shared code at the same time and always want them to be bleeding edge with respect to each other. That is, we want all our application builds to always use the latest version and the same version of the shared code. We frequently add to or modify application code and shared code at the same time. Currently, the shared code is in one CVS module, and the applications are all in their own separate modules. The shared code and application modules are checked out such that the shared code gets built once and then linked into each application. We frequently do cvs commits that include changes across shared and application modules at once. We would really like to keep that ability.
I understand that commits in Hg are atomic within repositories -- that's fine but I'd like to be able to diff and commit to an application and a shared library at the "same time" (i.e. I don't care if it's really atomic but I don't want to have to manually do two separate diffs and two separate commit actions).
Conceptually, it seems like it would be correct to have one or a few repos for the shared code, and a separate repo for each application and each little utility program. This means you'd need to check out multiple repos for each build but that isn't a problem for us. The problem is there doesn't seem to be any tooling that lets you view updates or changes on multiple repos at once, or diff multiple repos at once and then sequentially commit them for you. This would be easy to script, but that wouldn't help developers who want to use various GUI frontends to complement the command line.
It seems like in order to be able to commit across multiple codebases at once (from a user's perspective) and keep everything on the bleeding edge together, the only two solutions are:
Use a monolithic repo with EVERYTHING in it.
Create some subrepos but access/commit everything through a big monolithic "main" repo that contains all the subrepos to keep everything on the latest revisions (which doesn't seem any better than (1) to me).
It can't be that unusual to want to work with multiple "peer" repositories at the same time, even if the actions aren't truly atomic across all of them -- and yet I'm not finding tons of articles or posts clamoring for this ability.
In summary:
We would like to organize our code such that we can diff and commit application code and shared code at the same time from the user's perspective (they need not truly be atomic).
It seems like we should be putting application code and shared code in separate repositories.
Subrepositories tie parent repositories to specific revisions, which we do not want.
What am I missing here?
In my shop, we have many projects that are simply in separate repos, but the main application's repo has 2 other projects in it. One is a module that shares a significant amount of code with the main application, and the other is for database migrations for the application (it's even in a different language). I wanted related changes in both the application and the migrator to be committed together, inseparably. Altogether, all source files in this repo are between 10 and 11 MB.
So if putting everything in one repository is really what makes sense because you don't want to deal with subrepositories, then there's nothing wrong with putting everything in one repository. The one of mine is on the small side of medium, in my opinion. TortoiseHg's source is around 20 MB, OGRE is over 100 MB.
Without knowing more about your projects and their relationships, the impression I get is that a single repository would work just fine, and that you're not looking at this incorrectly.
If you change your mind, hg convert can help you extract projects into their own repository, maintaining the history of those files.
If the one-repository approach is not for you, then I think subrepos should be given a chance, as that is the only other method I know of for treating multiple repos cohesively that is supported in TortoiseHg (see the Recommendations section).
However, I'm not sure how you would deal with the inter-department access, given that it doesn't seem there is an established subset already shared with others.