Hg sub-repository dependencies


There have been a couple of questions about Hg sub-repo dependencies in the past (here and here) but the accepted answers don't seem to address the problem for me.
A project of mine has 4 dependencies: A, B, C and D. D depends on A, B and C, and B and C both depend on A.
I want to use Hg sub-repositories to store them so I can track which version of each they rely on. This is because, while I am using A, B, C and D in this project, other projects will require just A and B. Therefore B and C must track which version of A they need independently of D. At the same time, in my application the versions of B and C referenced by a given version of D must always use the same version of A as that referenced by the given version of D (otherwise it will just fall over at runtime). What I really want is to allow them to reference each other as siblings in the same directory, i.e. D's .hgsub would look like the following, and B and C's would contain just the first line.
..\A = https:(central kiln repo)\A
..\B = https:(central kiln repo)\B
..\C = https:(central kiln repo)\C
However, this doesn't seem to work. I can see why (it'd be easy to give people enough rope to hang themselves with), but it's a shame as I think it's the neatest solution to my dependencies. I've read a few suggested solutions, which I'll quickly outline along with why they don't work for me:
Include copies as nested sub-directories, reference these as Hg sub-repositories. This yields the following directory structure (I've removed the primary copies of A, B, C, B\A, C\A as I can accept referencing the copies inside \D instead):
project\ (all main project files)
project\D
project\D\A
project\D\B
project\D\B\A
project\D\C
project\D\C\A
Problems with this approach:
I now have 3 copies of A on disk, all of which could have independent modifications which must be synced and merged before pushing to a central repo.
I have to use other mechanisms to ensure that B, C and D are referencing the same version of A (e.g. D could use v1 while D\B could use v2)
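One such "other mechanism" could be a small consistency check over the .hgsubstate files, which record one `<revision> <path>` pair per line. The following is only a sketch: the function names, the repo labels and the revision hashes are hypothetical, and in real use the text would be read from each repository's .hgsubstate file.

```python
from typing import Dict


def parse_hgsubstate(text: str) -> Dict[str, str]:
    """Map subrepo path -> pinned revision from .hgsubstate content."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is "<revision> <path>"; the path may contain spaces.
        revision, path = line.split(" ", 1)
        pins[path] = revision
    return pins


def revisions_of(substates: Dict[str, str], name: str = "A") -> Dict[str, str]:
    """For each repo, report which revision of subrepo `name` it pins, if any."""
    found = {}
    for repo, text in substates.items():
        pins = parse_hgsubstate(text)
        if name in pins:
            found[repo] = pins[name]
    return found


def is_consistent(substates: Dict[str, str], name: str = "A") -> bool:
    """True when every repo that pins `name` pins the same revision."""
    return len(set(revisions_of(substates, name).values())) <= 1


# Hypothetical .hgsubstate contents for D, D\B and D\C (hashes are made up).
V1 = "1" * 40
V2 = "2" * 40
substates = {
    "D": f"{V1} A\n{'b' * 40} B\n{'c' * 40} C\n",
    "D/B": f"{V2} A\n",
    "D/C": f"{V1} A\n",
}

print(is_consistent(substates))  # False: D\B pins a different revision of A
```

A check like this could run from a hook or a build step, but it only reports the mismatch; it does not remove the extra copies of A.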
A variation: use the above but specify the RHS of the .hgsub to point to a copy in the parent copy (i.e. B and C should have the .hgsub below):
A = ..\A
Problems with this approach:
I still have three copies on disk
The first time I clone B or C it will attempt to recursively pull the referenced version of A from "..\A", which may not exist, presumably causing an error. If it doesn't exist it gives no clue as to where the repo should be found.
When I do a recursive push of changes, the changes in D\B\A do not go into the shared central repo; they just get pushed to D\A instead. So if I push twice in a row I can guarantee that all changes will have propagated correctly, but this is quite a fudge.
Similarly if I do a (manual) recursive pull, I have to get the order right to get the latest changes (i.e. pull D\A before I pull D\B\A)
Use symlinks to point folder \D\B\A to D\A etc.
Problems with this approach:
symlinks cannot be encoded in the Hg repo itself so every time a team member clones the repo, they have to manually/with a script re-create the symlinks. This may be acceptable but I'd prefer a better solution. Also (personal preference) I find symlinks highly unintuitive.
Are these the best available solutions? Is there a good reason why my initial .hgsub (see top) is a pipe-dream, or is there a way I can request/implement this change?
UPDATED to better explain the wider usage of A,B,C,D

Instead of trying to manage your dependencies via Mercurial (or with any SCM for that matter), try using a dependency management tool instead, such as Apache Ivy.
Using an Ivy-based approach, you don't have any sub-repos; you would just have projects A, B, C and D. A produces an artifact (e.g. a .jar, .so or .dll), which is published with a version into an artifact repository (basically a place where you keep your build artifacts). Projects B and C can then depend on a specific version of A (controlled via an ivy.xml file in each project), which Ivy will retrieve from the artifact repository. Projects B and C also produce artifacts that are published to your repository. Project D depends on B and C, and Ivy can be told to retrieve the dependencies transitively, which means it will get the artifacts for B, C and A (because B and C depend on A).
A similar approach can be used with Apache Maven and Gradle (the latter uses Ivy).
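To illustrate what "transitively" means here, the sketch below walks a dependency graph shaped like the A/B/C/D example and lists everything D needs, dependencies first. It is not Ivy's algorithm or API, just a plain depth-first traversal; the `DIRECT_DEPS` table and the function name are assumptions made for the illustration.

```python
from typing import Dict, List

# Hypothetical direct dependencies, as each project would declare them.
DIRECT_DEPS: Dict[str, List[str]] = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["B", "C"],
}


def resolve_transitively(project: str, direct_deps: Dict[str, List[str]]) -> List[str]:
    """Return every dependency of `project`, each listed after its own dependencies."""
    ordered: List[str] = []
    seen = set()

    def visit(name: str) -> None:
        for dep in direct_deps[name]:
            if dep not in seen:
                seen.add(dep)       # mark first so a shared dependency is visited once
                visit(dep)          # resolve the dependency's own dependencies
                ordered.append(dep)

    visit(project)
    return ordered


print(resolve_transitively("D", DIRECT_DEPS))  # ['A', 'B', 'C']
```

D only declares B and C, yet A appears once in the result because both B and C declare it.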
The main advantages are that:
it makes it very clear what versions of each component a project is using (sometimes people forget to check .hgsub, so they don't know they are working with subrepos),
it makes it impossible to change a project you depend on from within the dependent project (as you are working with artifacts, not code),
it saves you from having to rebuild dependent projects and being unsure of what version you are using,
and it saves you from having multiple redundant copies of projects that are used by other projects.
EDIT: Similar answer with a slightly different spin at Best Practices for Project Feature Sub-Modules with Mercurial and Eclipse?

You say you want to track which version they each rely on but you'd also be happy with a single copy of A shared between B, C and D. These are mutually exclusive: with a single copy of A, any change to A will cause a change in the .hgsubstate of each of B, C and D, so there is no independence in the versioning (as all of B, C and D will commit after a change to A).
Having separate copies will be awkward too. If you make a change that affects both B's copy of A and C's copy then attempt to push the whole structure, the changes to (say) B will succeed but the changes to C will fail because they require merging with the changes you just pushed from B, to avoid creating new heads. And that will be a pain.
The way I would do this (and maybe there are better ways) would be to create a D repo with subrepos of A, B and C. Each of B and C would have some untracked A-location file (which you're prompted to enter via a post-clone hook), telling your build system where to look for its A repository. This has the advantage of working but you lose the convenience of a system which tracks concurrent versions of {B, C} and A. Again, you could do this manually with an A-version file in each of B or C updated by a hook, read from by a hook, and you could make that work, but I don't think it's possible using the subrepos implementation in hg. My suggestions really boil down to implementing a simplified subrepo system of your own.

Related

How to work with Mercurial fork parent repository?

For example, there is an hg repo A containing a project's environment setup. It contains the following files:
//project A
.some_config_file
script_1
After project B was forked from A, some changes were made:
// project B
M .some_config
M script_1
In parallel, some features were improved or bugs fixed in script_1 in project A:
// project A
M script_1
When I try to pull the new features from A into B (hg pull -u 'repA'), it brings the old .some_config back into the repository and overwrites the current one.
Here are my questions:
How do I resolve these conflicts?
How do I pull only some of the changes from the fork parent?
And what is the best practice for working with a fork parent?
Pulling from the forked repo pollutes the local one.

You seem to be unfamiliar with the distinction between your 'working copy' and the repository as a tree of individual changesets.
The solution likely is: update your working copy to your fork B. Then merge the original project, fork A, into your currently checked-out version, into fork B. Take care to only accept those changes during the merge which you want to be merged - and discard any changes made to .some_config
Besides that, it's often a bad idea to have config files in a repo. Only have example config files there (and name them such) and keep the actual config file outside, untracked.

Merge a mercurial repository into a subdirectory of another one

I need to merge all new changesets of a Mercurial repository (let's call it A) into a subdirectory of another Mercurial repository (let's call it B) on a regular basis. This means that just copying all files is not an option, as the files in B may also be changed and a proper merge must be done.
The only thing I've found so far is http://hgtip.com/tips/advanced/2009-11-17-combining-repositories/, which is about combining repositories, not about merging new changesets over and over again.
Any ideas? Thanks in advance.

This sounds like a case where you want to make use of sub-repositories: Including repository A as a sub-repository into the parent repository B.
Let's assume that everything of A is found in the path B/A.
That way, you would have locally your repository B and inside it another repository. You can then go to repository A, pull from the other repository A' and do the merge however you see fit. Then go back to parent level, to repository B. Update the tracked status of A and that's it. If you want to make any local changes which affect both, A and B, use the recursive hg commit --subrepos. See https://www.mercurial-scm.org/wiki/Subrepository for a more verbose description.
Mind however, that sub-repositories are a feature of last resort; that means it has some rough edges. One of them is that it's virtually impossible to undo the entanglement of the two repositories.
Maybe similar features like Guest Repositories, HG Nested or Forest Extensions are better suited for your actual use case.

Dealing with shared dependencies in subdir projects

This is (I'm guessing) a not too rare occurrence, but how do people deal with common links in subrepos? It may just be one of the reasons why subdirs are a pain in the butt to use.
Subrepo A has a subrepo B at rev 5
Subrepo C has a subrepo B at rev 10
Subrepo D has A and C. There is now a conflict between the Bs in some build systems.
So you get the dependency structure:
D___A__B
\__C__/
Even if you managed to get A and C pointed at the same revision, there are still two copies of the code, which creates a conflict.
What would probably be better is to say "A requires B at rev 5", "C requires B at rev 10", "D requires A at ref X" and "D requires B at ref X", and then "A is here, B is here, C is here, D is here, FIND THE CONFLICTS", but I don't think that is currently possible.
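The "FIND THE CONFLICTS" idea could be sketched outside Mercurial as a check over declared requirements. This is purely illustrative: the `REQUIREMENTS` table and `find_conflicts` are hypothetical, the B revisions come from the example above, and the revisions D uses for A and C are made-up placeholders.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (requirer, dependency, revision). B's revisions follow the example above;
# the revisions D requires of A and C are invented for the illustration.
REQUIREMENTS: List[Tuple[str, str, int]] = [
    ("A", "B", 5),
    ("C", "B", 10),
    ("D", "A", 7),
    ("D", "C", 3),
]


def find_conflicts(requirements: List[Tuple[str, str, int]]) -> Dict[str, Dict[int, List[str]]]:
    """Return {dependency: {revision: [requirers]}} for dependencies pinned at more than one revision."""
    pins: Dict[str, Dict[int, List[str]]] = defaultdict(lambda: defaultdict(list))
    for requirer, dependency, revision in requirements:
        pins[dependency][revision].append(requirer)
    return {
        dependency: dict(revisions)
        for dependency, revisions in pins.items()
        if len(revisions) > 1
    }


print(find_conflicts(REQUIREMENTS))  # {'B': {5: ['A'], 10: ['C']}}
```

Only B is reported, because it is the only dependency required at two different revisions.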

You don't deal with it; this is not a common case.
First of all, it is greatly discouraged to have a sub-repo outside the directory, and that setup requires at least one outside sub-repo.
Second, even if you set up an outside repo, it is even less recommended to have more than one repo holding a reference to a single repo.
So yes, it is possible, but getting in that state is already hard to do, and if someone actually came to me with this, I'd tell him that he should rethink his configuration. There is a bad smell here anyway, and it is not mercurial's job to handle this.

Hg sub repos dependancies

I’m currently migrating from CVSNT to Mercurial but am running into problems with what I used to be able to achieve with CVS modules. I have two projects say A and B which both depend on common code in a directory C. If I make a change to code in directory C, I want the changes to be reflected in both projects A and B. (it seems to be the exact reverse of the problem here)
I thought this could be achieved using subrepos, but the .hgsubstate in projects A and B keeps a note of the changeset in the subrepo that I last committed against. I.e. if I commit a change in C from project A, I have to manually open B to update and commit. (In actual fact there are many more projects than just A and B, and yes, I know common code should be in a shared library, but the PHB insists!)
Is there a way to achieve this? Ideally I’d like it to be transparent to the user how the repos are structured, i.e. they can commit a change to C and not have to realise it is a subrepo of their project. (At the moment TortoiseHg uses an ‘S’ to indicate a subrepo is dirty.) I guess what I need is a daddy repo and a partial checkout, but surely there must be a better way? I'm on Windows so symlinks are out.

The fact that you have to manually update (and commit) the subrepository is very much intentional. This misfeature found in CVS and SVN was intentionally left out.
If Mercurial automatically updated subrepositories to the latest commit, there would be no guarantee that the code would remain working. E.g. if you changed the API of C, neither A nor B would work until you changed them accordingly. And because you can never push multiple repositories atomically at the same time, there would unavoidably be a window during which A and B do not work. In practice, this window can grow rather large, especially once one of the depending projects receives less development attention. And if project B were put on hold for a while, the changes made to C for project A would break it in no time.
Worse, updating to an earlier version would not be able to restore the subrepository to the state it was in at the time. So as soon as you would make a backwards incompatible change to the subrepository, older versions would not work anymore! This greatly reduces the usefulness of version control. It would cause problems with branching and e.g. bisecting would not be able to work.
This is why you are required to manually update to the latest version of C, test whether everything still works, and check in that subrepository update, thereby tightly locking the repository version to the subrepository version. You sacrifice a little bit of convenience, but you gain code stability, and I hope I was able to make clear why this is much better :).
As for transparently committing to a subrepository from within a parent repository, as far as I know it is planned to do this but so far no-one has actually implemented it.

Does a mercurial subrepository have to be a subdirectory of the main repository?

My project is made up of code in the following locations
C:\Dev\ProjectA
C:\Lib\LibraryB
C:\Lib\LibraryC
Presently each of these folders is a completely independent Mercurial repository. Project A changes all the time, Library B and Library C change rarely.
I currently tag each version of Project A as it is released and (when I remember) put a corresponding tag in the Library B and C repositories.
Can I improve upon this by using subrepositories? Would that require me to make Library B and C a subdirectory of Project A?
If Library B and C must be subdirectories of Project A what do I do if I want to start a Project D that uses Library B but isn't otherwise affiliated with Project A at all?

"If Library B and C must be subdirectories of Project A what do I do if I want to start a Project D that uses Library B but isn't otherwise affiliated with Project A at all?"
Any project can exist both independently and as subrepository of another project at the same time. I'll explain by suggesting a workflow.
First of all, each of your projects (A, B, C) should have a blessed repository that is published somewhere:
You could run hgwebdir on your own server, or make use of a Mercurial hosting service like Bitbucket or Kiln. This way developers have a central authoritative point to pull/push changes from, and you have something to make backups of.
Now you can make clones of these repositories to work on in two different ways:
directly clone your project. For example:
hg clone http://bitbucket.org/LachlanG/LibraryB C:\Lib\LibraryB
and/or create subrepository definitions by putting a .hgsub file in the root of ProjectA with the following content:
libraries/libraryB = http://bitbucket.org/LachlanG/LibraryB
libraries/libraryC = http://bitbucket.org/LachlanG/LibraryC
These subrepository definitions tell Mercurial that whenever Project A is cloned, it also has to put clones of Library B and Library C in the libraries folder.
If you are working in Project A and commit, then your changes in libraries/libraryB and libraries/libraryC will be committed as well. Mercurial will record which version of the libraries is being used by Project A in the .hgsubstate file. The result is that if you hg update to an old version of the project to see how things worked last week, you also get the corresponding version of your libraries. You don't even need to make tags :-)
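To see how the two files fit together, here is a rough sketch that joins the sources declared in .hgsub with the revisions pinned in .hgsubstate. It assumes only plain `path = source` lines in .hgsub and `<revision> <path>` lines in .hgsubstate; the function names and the revision hashes are hypothetical.

```python
from typing import Dict, Optional, Tuple


def parse_hgsub(text: str) -> Dict[str, str]:
    """Map subrepo path -> source, handling only plain 'path = source' lines."""
    sources = {}
    for line in text.splitlines():
        path, separator, source = line.partition("=")
        if separator and path.strip():
            sources[path.strip()] = source.strip()
    return sources


def pinned_sources(hgsub_text: str, hgsubstate_text: str) -> Dict[str, Tuple[Optional[str], str]]:
    """Map subrepo path -> (source, pinned revision)."""
    sources = parse_hgsub(hgsub_text)
    result = {}
    for line in hgsubstate_text.splitlines():
        line = line.strip()
        if not line:
            continue
        revision, path = line.split(" ", 1)
        result[path] = (sources.get(path), revision)
    return result


# The .hgsub content from above, plus made-up revision hashes.
HGSUB = (
    "libraries/libraryB = http://bitbucket.org/LachlanG/LibraryB\n"
    "libraries/libraryC = http://bitbucket.org/LachlanG/LibraryC\n"
)
HGSUBSTATE = f"{'a' * 40} libraries/libraryB\n{'b' * 40} libraries/libraryC\n"

for path, (source, revision) in pinned_sources(HGSUB, HGSUBSTATE).items():
    print(path, source, revision[:12])
```

Because .hgsubstate is committed along with Project A, each changeset of A carries its own version of this table.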
When you hg push the Project A changes to the blessed repository, Mercurial will also make sure to push the subrepository changes first to their own origin. That way you never accidentally publish project changes which depend on unpublished library changes.
If you prefer to keep everything local, you can still use this workflow by using relative paths instead of URLs in the subrepository definitions.

You can indeed declare B and C as subrepos of project A (they will appear as subdirectories, as described in Mercurial Subrepository).
That would improve your release mechanism as it would allow you to:
get all repos in one place (A and under)
reference an exact tag of B and C under A
tag each sub-repo first if it had any modifications
tag A with the information about B and C tags in it (any clone of A will be able to get the exact tags of B and C used by A)
You can also declare B as a subrepo of D, independently of A. What you make in A (regarding B) will have no consequences for B used in D.
