Please bear with me....
╔════════════╦═══════════════════╦══════╗
║ repo name ║ role ║ user ║
╠════════════╬═══════════════════╬══════╣
║ RepoMain ║ production ║ Mr.A ║
║ RepoTest ║ test server ║ QA ║
║ RepoB_vm ║ Mr.B's vm ║ Mr.B ║
║ RepoB_home ║ Mr.B's final repo ║ Mr.B ║
║ RepoC_vm ║ Mr.C's vm ║ Mr.C ║
║ RepoC_home ║ Mr.C's final repo ║ Mr.C ║
╚════════════╩═══════════════════╩══════╝
You can imagine that Mr.A works with other people, so each of them has their own repository (same project).
There are several hot newbie questions that I think I am still not over with.
Basic workflow when working on your own VM (virtual machine):
Commit your changes --> pull from the test server into Repo_vm --> run your tests on the VM --> on success, ask QA to pull from Repo_home
Is this the best workflow possible? I am always afraid of merge problems (sometimes newer changes went missing; I had one terrible experience with that).
I don't think there is any big deal with production <--- test server, as it's one-way. That sounds like a safe merge.
But with multiple developers using the same test server repo, we will end up with Michael Myers chasing us down.
To expand the above workflow more explicitly...
commit changes on vm
pull from test server
run tests on vm
if all passes, update home repo
ask QA to pull from repo_home
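As plain hg commands, that sequence might look roughly like this (the paths, URLs, and test script here are placeholders, not part of the original setup):

    cd ~/repo_vm
    hg commit -m "My change"              # commit changes on the VM
    hg pull http://testserver/repo        # pull the latest alpha from the test server
    hg merge && hg commit -m "Merge"      # merge only if the pull created a second head
    ./run_tests.sh                        # run the test suite on the VM
    hg push ~/repo_home                   # on success, update the home repo
    # ...then ask QA to pull from repo_home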
With pull requests in mind, is this a better workflow?
commit changes on your own VM
pull changes from test server for the latest alpha version
run tests locally
if all goes well, push to your home repo on your own account
submit a pull request
if you are at the front of the queue, make a clone on the test server (a sandbox environment) and then do the merging (the test server might have newer changes than the last alpha version committed in the home repo)
if the tests pass, tell QA to pull from the merged sandbox repo
run tests
push to production on schedule
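The sandbox-merge step above might look like this in practice (a rough sketch; the server paths and repo names are made up for illustration):

    # on the test server, as the developer at the front of the queue
    hg clone /srv/hg/testrepo /srv/hg/sandbox-joe    # throwaway sandbox clone
    cd /srv/hg/sandbox-joe
    hg pull http://devbox/repo_home_joe              # bring in the submitted changes
    hg merge && hg commit -m "Merge Joe's changes"   # merge if two heads resulted
    # run the test suite here; if it's green, QA pulls from this sandbox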
Q2: What do you mean by QA giving limited time?
Q3: How often should developers pull from the test server (which contains the latest stable alpha version)?
Thanks.
Merging is a tricky problem. Mercurial does a pretty good job of handling things automatically, but it can't solve conflicts. That is best left to a person, and the best people for doing that are the developers making the changes. Don't make the QA merge anything. Merging conflicts requires careful attention to detail and should not be taken lightly. Careless merging is a problem no software can solve.
I think your workflow is fine. QA should treat pull requests like a queue. When a developer gets to the front of the queue, he's given an opportunity to pull, merge, and test. Once he's finished, he notifies QA, who then pull his changes. Since no other code has entered the repository, QA is guaranteed not to have to merge.
QA could also give developers a limited amount of time to merge, build, and test, depending on the speed of your processes and the size of a developer's changes. That way, you don't get a huge queue of changes piling up behind the poor developer struggling to get things working.
Related
I'm new to mercurial and I have problems with the solution we're working to implement in my company. I work in a Lab with a strict security environment, and the production Mercurial server is on an isolated network. Everyone has two computers: one to work in the "real world" and another to work in the isolated and secure environment.
The problem is that we have other Labs distributed around the world, and in some cases two or more Labs need to work together on a project. Every Lab has an HG server to manage its own projects locally, but I'm not sure our method of syncing common projects is the best solution. To do it, we use a "bundle" to send the new changesets from one Lab to another. My question is about how good this method is, because the solution is a little bit complicated. The procedure is more or less this:
In Lab B: hg pull and update, to be sure the local folder is at the latest version.
Ask the other Lab for its hg log, to find the last common changeset.
In Lab A: hg pull and update, to be sure the local folder is at the latest version.
In Lab A: make a bundle, hg bundle --base XX project.bundle (where XX is the last common changeset).
Send it to Lab B (with a complicated method due to the security rules: encrypted files, encrypted drives, secure erases, etc.).
In Lab B: hg unbundle projectYY.bundle in the local folder.
This process creates two heads, which sometimes forces you to merge.
Once the changesets from Lab A are correctly integrated at Lab B, we need to repeat the process in the opposite direction, to bring the project's evolution in Lab B back to Lab A.
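Spelled out as commands, one round trip of that procedure looks roughly like this (XX is the last common changeset, as above):

    # In Lab A, after confirming the last common changeset XX:
    hg pull && hg update
    hg bundle --base XX project.bundle
    # ...encrypt project.bundle and transfer it to Lab B...
    # In Lab B:
    hg unbundle project.bundle
    hg merge && hg commit -m "Merge changes from Lab A"   # only if two heads resulted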
Could anyone enlighten me as to the best way out of this dilemma?
Anyone have a better solution?
Thanks a lot for your help.
Bundles are the right vehicle for propagating changes without a direct connection. But you can simplify the bundle-building process by modeling communication locally:
In Lab A, maintain repoA (the central repo for local use), as well as repoB, which represents the state of the repository in lab B. Lab B has a complementary set-up.
You can use this dual set-up to model the relationship between the labs as if you had a direct connection, but changeset sharing proceeds via bundles instead of push/pull.
From the perspective of Lab A: Update repoA the regular way, but update repoB only with bundles that you receive from Lab B and bundles (or changesets) that you are sending to Lab B.
More specifically (again from the perspective of Lab A):
In the beginning the repos are synchronized, but as development progresses, changes are committed only to repoA.
When it's time to bring lab B up to speed, just go to repoA and run hg outgoing path/to/repoB. You now know what to bundle without having to request and study lab B's logs. In fact, hg bundle bundlename.bzip repoB will bundle the right changesets for you.
Encrypt and send off your bundle.
You can assume that the bundle will be integrated into Lab B's home repo, so update your local repoB as well, either by pushing directly or (for assured consistency) by unbundling (importing) the same bundle that was mailed off.
When lab B receives the bundle, they will import it into their own copy of repoA -- it is now updated to the same state as repoA in lab A. Lab B can now push or pull these changes into their own repoB and merge them (in repoB) with their own unshared changesets. This will generate one or more merge changesets, which are handled just like any other check-ins to lab B's repoB.
And that's that. When lab B sends a bundle back to lab A, it uses the same process, steps 1 to 5. Everything stays synchronized just as it would if the repositories were directly connected. As always, it pays to synchronize frequently so as to avoid diverging too far and encountering merge conflicts.
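From Lab A's perspective, one cycle then shrinks to roughly this (the local paths are assumptions for illustration):

    cd /repos/repoA
    hg outgoing /repos/repoB                     # what does lab B not have yet?
    hg bundle changes.bundle /repos/repoB        # bundle exactly those changesets
    # ...encrypt and send changes.bundle to lab B...
    hg -R /repos/repoB unbundle changes.bundle   # keep the local model of lab B in sync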
In fact you have more than two labs. The approaches to keeping them synchronized are the same as if you had a direct connection: Do you want a "star topology" with a central server that is the only node the other labs communicate with directly? Then each lab only needs a local copy of this server. Do you need lots of bilateral communication before some work is shared with everyone? Then keep a local model of every lab you want to exchange changesets with.
If you have no direct network communication between the two mercurial repositories, then the method you describe seems like the easiest way to sync those two repositories.
You could probably trim some of the boilerplate around finding the new changesets that need bundling; how exactly depends on your setup.
For one, you don't need to update your working copy in order to create the bundles; having the repository itself is enough -- no working copy required.
And if you know the date and time of the last sync, you can simply bundle all changesets added since that time, using an appropriate revset, e.g. all revisions since 30th March this year: hg log -r 'date(">2015-03-30")'. Thus you could skip a lengthy manual review process.
If your repository is not too big (and thus fits on the media you use for exchange), simply copy it there in its entirety and do a local pull from that exchange disk to sync, skipping those review processes, too.
Of course you will not be able to avoid making the merges - they are the price you have to pay when several people work on the same thing at the same time, each committing to their own repo.
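For the whole-repository variant, the exchange boils down to a local pull (the mount point below is just an example):

    # sending side: copy the entire repo onto the exchange medium
    cp -r /path/to/project /media/exchange/project
    # receiving side: pull from the exchange medium like from any other repo
    hg pull /media/exchange/project
    hg merge && hg commit -m "Merge changes from the other lab"   # if heads diverged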
Context:
7 devs
1 product
3 branches:
Version 3.6 (stable)
Version 3.7 (stable)
Master (dev)
Rules & Policies in place:
Any fix made in an earlier version must be merged in all future versions.
Integration is continuous: if you fix something in 3.6, you must integrate and test in 3.7 and in master before you push.
When possible, rebase your work before you commit so that stuff you committed two days ago locally will actually be put back on top. I know this is a matter of preference and has pros and cons, but this is what we like best as a team.
Our problem:
We have too many useless merge operations to do. Here is a scenario:
Normal integration work:
Joe and Bill work on two different fixes that go in 3.6.
Joe is done, he pulls (and rebases)
Joe tests one last time in his 3.6 branch
Joe switches to 3.7 and merges 3.6 - merge 1
Joe tests again, this time in the context of 3.7
Joe switches to 3.8 and merges 3.7 - merge 2
Joe tests again, this time in the context of 3.8
Joe is ready to push
Bill did pretty much the same thing but pushed right after Joe pulled
Joe tries to push but the operation fails because it would create a new head
Painful (useless) merges:
Joe pulls, he gets stuff from Bill in 3.6, 3.7 and 3.8
Joe updates to 3.6 and merges changes he received from the pull - merge 3
Joe updates to 3.7 and merges changes he received from the pull - merge 4
Joe still in 3.7 merges 3.6 - merge 5
Joe updates to 3.8 and merges changes he received from the pull - merge 6
Joe still in 3.8 merges 3.7 - merge 7
Joe tries to push and prays that nobody pushed something to 3.6 in the meantime.
We are thinking of writing an extension (or batch script or program) to automatically merge in this kind of situation, so that when Joe finds out that he cannot push, he would just run MergeUpAutomagically (a rough sketch of the idea follows below). But before we try to fix this, I want to make sure we are using the right workflow.
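For reference, a first sketch of such a helper (the branch names are hard-coded here as an assumption; a real version would need conflict handling and a test hook, and this one simply aborts if a merge has nothing to do or needs manual resolution):

    #!/bin/sh
    set -e
    hg pull
    prev=""
    for branch in 3.6 3.7 3.8; do
        hg update "$branch"
        # merge any second head the pull may have created on this branch
        if [ "$(hg heads --template 'x' "$branch" | wc -c)" -gt 1 ]; then
            hg merge && hg commit -m "Merge incoming changes on $branch"
        fi
        # then merge the previous (older) branch upward
        if [ -n "$prev" ]; then
            hg merge "$prev" && hg commit -m "Merge $prev into $branch"
        fi
        prev="$branch"
    done
    hg push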
If I understand correctly, you are using named branches in the same clone.
I find it easier to use a different clone for each version (releases, devs), where each clone contains a named branch related to the version (and also changesets from the older branches). We have "official" clones where we synchronize (pulls and pushes).
Advantages:
No need to "switch" by doing hg update (in my situation I use Eclipse instances with a different workspace for each project). I used to work with named branches in the same clone but found it confusing.
Easier to see where the changesets come from (from which named branches-versions). Also, if someone pushes from a higher version to an older one by mistake, it is easy to spot.
Synchronization is more "atomic". We pull and push for each "official" clone-named-branch, then pull between "official" named branches (from older to newer).
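That synchronization step might look like this (the clone paths are assumptions for illustration):

    # propagate a fix from the official 3.6 clone to the official 3.7 clone
    cd /repos/official-3.7
    hg pull /repos/official-3.6
    hg update 3.7
    hg merge 3.6 && hg commit -m "Merge 3.6 into 3.7"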
In your situation, maybe Bill pushed before Joe but had only had time to do it in 3.6, and Joe would have noticed that before synchronizing to the higher versions (not sure it would help in your situation). Also, maybe it is not necessary to synchronize the "dev" branch as often as the "releases".
We develop .NET Enterprise Software in C#. We are looking to improve our version control system. I have used mercurial before and have been experimenting with it at our company. However, since we develop enterprise products, we have a big focus on reusable components or modules. I have been attempting to use mercurial's sub-repos to manage components and dependencies but am having some difficulties. Here are the basic requirements for source control/dependency management:
Reusable components
Shared by source (for debugging)
Have dependencies on 3rd party binaries and other reusable components
Can be developed and commited to source control in the context of a consuming product
Dependencies
Products have dependencies on 3rd party binaries and other reusable components
Dependencies have their own dependencies
Developers should be notified of version conflicts in dependencies
Here is the structure in mercurial that I have been using:
A reusable component:
SHARED1_SLN-+-docs
            |
            +-libs----NLOG
            |
            +-misc----KEY
            |
            +-src-----SHARED1-+-proj1
            |                 +-proj2
            |
            +-tools---NANT
A second reusable component, consuming the first:
SHARED2_SLN-+-docs
            |
            +-libs--+-SHARED1-+-proj1
            |       |         +-proj2
            |       |
            |       +-NLOG
            |
            +-misc----KEY
            |
            +-src-----SHARED2-+-proj3
            |                 +-proj4
            |
            +-tools---NANT
A product that consumes both components:
PROD_SLN----+-docs
            |
            +-libs--+-SHARED1-+-proj1
            |       |         +-proj2
            |       |
            |       +-SHARED2-+-proj3
            |       |         +-proj4
            |       |
            |       +-NLOG
            |
            +-misc----KEY
            |
            +-src-----prod----+-proj5
            |                 +-proj6
            |
            +-tools---NANT
Notes
Repos are in CAPS
All child repos are assumed to be subrepos
3rd party (binary) libs and internal (source) components are all subrepos located in the libs folder
3rd party libs are kept in individual mercurial repos so that consuming projects can reference particular versions of the libs (i.e. an old project may reference NLog v1.0, and a newer project may reference NLog v2.0).
All Visual Studio .csproj files are at the 4th level (proj* folders) allowing for relative references to dependencies (i.e. ../../../libs/NLog/NLog.dll for all Visual Studio projects that reference NLog)
All Visual Studio .sln files are at the 2nd level (src folders) so that they are not included when "sharing" a component into a consuming component or product
Developers are free to organize their source files as they see fit, as long as the sources are children of the proj* folder of the consuming Visual Studio project (i.e., there can be n children of the proj* folders, containing various sources/resources)
If Bob is developing the SHARED2 component and the PROD1 product, it is perfectly legal for him to make changes to the SHARED2 source (say, sources belonging to proj3) within the PROD_SLN repository and commit those changes. We don't mind if someone develops a library in the context of a consuming project.
Internally developed components (SHARED1 and SHARED2) are generally included by source in consuming project (in Visual Studio adding a reference to a project rather than browsing to a dll reference). This allows for enhanced debugging (stepping into library code), allows Visual Studio to manage when it needs to rebuild projects (when dependencies are modified), and allows the modification of libraries when required (as described in the above note).
Questions
If Bob is working on PROD1 and Alice is working on SHARED1, how can Bob know when Alice commits changes to SHARED1? Currently with Mercurial, Bob is forced to manually pull and update within each subrepo. If he pushes/pulls to the server from the PROD_SLN repo, he never knows about updates to subrepos. This is described at the Mercurial wiki. How can Bob be notified of updates to subrepos when he pulls the latest PROD_SLN from the server? Ideally, he should be notified (preferably during the pull) and then manually decide which subrepos he wants to update.
Assume SHARED1 references NLog v1.0 (commit/rev abc in mercurial) and SHARED2 references NLog v2.0 (commit/rev xyz in mercurial). If Bob is absorbing these two components in PROD1, he should be made aware of this discrepancy. While technically Visual Studio/.NET would allow 2 assemblies to reference different versions of dependencies, my structure does not allow this because the path to NLog is fixed for all .NET projects that depend on NLog. How can Bob know that two of his dependencies have version conflicts?
If Bob is setting up the repository structure for PROD1 and wants to include SHARED2, how can he know what dependencies are required for SHARED2? With my structure, he would have to manually clone (or browse on the server) the SHARED2_SLN repo and either look in the libs folder or peek at the .hgsub file to determine what dependencies he needs to include. Ideally this would be automated: if I include SHARED2 in my product, SHARED1 and NLog are auto-magically included too, notifying me if there is a version conflict with some other dependency (see question 2 above).
Bigger Questions
Is mercurial the correct solution?
Is there a better mercurial structure?
Is this a valid use for subrepos (i.e. Mercurial developers marked subrepos as a feature of last resort)?
Does it make sense to use mercurial for dependency management? We could use yet another tool for dependency management (maybe an internal NuGet feed?). While this would work well for 3rd party dependencies, it really would create a hassle for internally developed components (i.e. if they are actively developed, developers would have to constantly update the feed, we would have to serve them internally, and it would not allow components to be modified by a consuming project (Note 8 and Question 2).
Do you have a better solution for Enterprise .NET software projects?
References
I have read several SO questions and found this one to be helpful, but the accepted answer suggests using a dedicated tool for dependencies. While I like the features of such a tool, it does not allow dependencies to be modified and committed from a consuming project (see Bigger Question 4).
This may not be the answer you were looking for, but we have recent experience of novice Mercurial users using sub-repos, and I've been looking for an opportunity to pass on our experience...
In summary, my advice based on experience is: however appealing Mercurial sub-repos may be, do not use them. Instead, find a way to lay out your directories side-by-side, and to adjust your builds to cope with that.
However appealing it seems to be to tie together revisions in the sub-repo with revisions in the parent repo, it just doesn't work in practice.
During all the preparation for the conversion, we received advice from multiple different sources that sub-repos were fragile and not well-implemented - but we went ahead anyway, as we wanted atomic commits between repo and sub-repo. The advice - or my understanding of it - talked more about the principles rather than the practical consequences.
It was only once we went live with Mercurial and a sub-repo, that I really understood the advice properly. Here (from memory) are examples of the sorts of problems we encountered.
Your users will end up fighting the update and merge process.
Some people will update the parent repo and not the sub-repo
Some people will push from the sub-repo, and .hgsubstate won't get updated.
You will end up "losing" revisions that were made in the sub-repo, because someone will manage to leave the .hgsubstate in an incorrect state after a merge.
Some users will get into the situation where the .hgsubstate has been updated but the sub-repo hasn't, and then you'll get really cryptic error messages, and will spend many hours trying to work out what's going on.
And if you do tagging and branching for releases, the instructions for how to get this right for both parent and sub-repo will be many dozens of lines long. (And I even had a nice, tame Mercurial expert help me write the instructions!)
All of these things are annoying enough in the hands of expert users - but when you are rolling out Mercurial to novice users, they are a real nightmare, and the source of much wasted time.
So, having put in a lot of time to get a conversion with a sub-repo, several weeks later we then converted the sub-repo to a repo. Because we had large amounts of history in the conversion that referred to the sub-repo, via .hgsubstate, it's left us with something much more complicated.
I only wish I'd really appreciated the practical consequences of all the advice much earlier on, e.g. in Mercurial's Features of Last Resort page:
But I need to have managed subprojects!
Again, don't be so sure. Significant projects like Mozilla that have tons of dependencies do just fine without using subrepos. Most smaller projects will almost certainly be better off without using subrepos.
Edit: Thoughts on shell repos
With the disclaimer I don't have any experience of them...
No, I don't think many of the problems are solved by shell repos. You are still using sub-repos, so all the same user issues apply (unless you can provide a wrapper script for every step, of course, to remove the need for humans to supply the correct options to handle sub-repos.)
Also note that the wiki page you quoted does list some specific issues with shell repos:
overly-strict tracking of relationship between project/ and somelib/
impossible to check or push project/ if somelib/ source repo becomes unavailable
lack of well-defined support for recursive diff, log, and status
recursive nature of commit surprising
Edit 2 - do a trial, involving all your users
The point at which we really started realising we had an issue was once multiple users started making commits, and pulling and pushing - including changes to the sub-repo. For us, it was too late in the day to respond to these issues. If we'd known them sooner, we could have responded much more easily and simply.
So at this point, the best advice I think I can offer is to recommend that you do a trial run of the project layout before the layout is carved in stone.
We left the full-scale trial until too late to make changes, and even then people only made changes in the parent repo, and not the sub-repos - so we still didn't see the full picture until too late.
In other words, whatever layout you consider, create a repository structure in that layout, and get lots of people making edits. Try to put enough real code into the various repos/sub-repos so that people can make real edits, even though they will be throw-way ones.
Possible outcomes:
You might find it all works fine - in which case, you'll have spent some time to gain certainty.
On the other hand, you might identify issues much more quickly than you would by trying to reason out in advance what the outcomes might be
And your users will learn a lot too.
Question 1:
This command, when executed in the parent "shell" repo, will traverse all subrepos and list the changesets from the default pull location that are not yet present locally:
hg incoming --subrepos
The same thing can be accomplished by clicking on the "Incoming" button on the "Synchronize" pane in TortoiseHg if you have the "--subrepos" option checked (on the same pane).
Thanks to the users in the mercurial IRC channel for helping here.
Questions 2 & 3:
First I need to modify my repo structures so that the parent repos are truly "shell" repos as recommended on the hg wiki. I will take this to the extreme and say that the shell should contain no content, only subrepos as children. In summary, rename src to main, move docs into the subrepo under main, and change the prod folder to a subrepo.
SHARED1_SLN:
SHARED1_SLN-+-libs----NLOG
            |
            +-misc----KEY
            |
            +-main----SHARED1-+-docs
            |                 +-proj1
            |                 +-proj2
            |
            +-tools---NANT
SHARED2_SLN:
SHARED2_SLN-+-libs--+-SHARED1-+-docs
            |       |         +-proj1
            |       |         +-proj2
            |       |
            |       +-NLOG
            |
            +-misc----KEY
            |
            +-main----SHARED2-+-docs
            |                 +-proj3
            |                 +-proj4
            |
            +-tools---NANT
PROD_SLN:
PROD_SLN----+-libs--+-SHARED1-+-docs
            |       |         +-proj1
            |       |         +-proj2
            |       |
            |       +-SHARED2-+-docs
            |       |         +-proj3
            |       |         +-proj4
            |       |
            |       +-NLOG
            |
            +-misc----KEY
            |
            +-main----PROD----+-docs
            |                 +-proj5
            |                 +-proj6
            |
            +-tools---NANT
All shared libs and products have their own repo (SHARED1, SHARED2, and PROD).
If you need to work on a shared lib or product independently, there is a shell available (my repos ending with _SLN) that uses hg to manage the revisions of the dependencies. The shell is only for convenience; it contains no content, only subrepos.
When rolling a release of a shared lib or product, the developer should list all of the dependencies and their hg revs/changesets (or preferably human-friendly tags) that were used to create the release. This list should be saved in a file in the repo for the lib or product (SHARED1, SHARED2, or PROD), not the shell. See Note A below for how this could solve Questions 2 & 3.
If I roll a release of a shared lib or product, I should put matching tags in the project's repo and its shell for convenience; however, if the shell gets out of whack (a concern expressed from real experience in #Clare 's answer), it really should not matter, because the shell itself is dumb and contains no content.
Visual Studio sln files go into the root of the shared lib or product's repo (SHARED1, SHARED2, or PROD) -- again, not the shell. The result being that if I include SHARED1 in PROD, I may end up with some extra solutions that I never open, but it doesn't matter. Furthermore, if I really want to work on SHARED1 and run its unit tests (while working in the PROD_SLN shell), it's really easy: just open the said solution.
Note A:
In regards to point 3 above, if the dependency file used a format similar to .hgsub but with the addition of the rev/changeset/tag, then getting the dependencies could be automated. For example, I want SHARED1 in my new product: clone SHARED1 into my libs folder and update to the tip or the last release tag. Now I need to look at the dependencies file and a) clone each dependency to the correct location and b) update to the specified rev/changeset/tag. Very feasible to automate this. To take it further, it could even track the rev/changeset/tag and alert the developer if there is a dependency conflict between shared libs.
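To illustrate, here is what such a dependency file and the automation around it might look like. Both the file name (.hgdeps) and its format are invented for this sketch:

    # .hgdeps -- one dependency per line: <local path> = <source URL> <rev/tag>
    libs/SHARED1 = https://hg.example.com/SHARED1 v1.4
    libs/NLOG    = https://hg.example.com/NLOG    v2.0

A few lines of shell could then pin every dependency to its listed revision:

    # clone each dependency if missing, then update it to the listed rev/tag
    while read path eq url rev; do
        case "$path" in "#"*|"") continue ;; esac    # skip comments/blanks
        [ -d "$path" ] || hg clone -U "$url" "$path"
        hg -R "$path" update -r "$rev"
    done < .hgdeps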
A hole remains if Alice is actively developing SHARED1 while Bob is developing PROD. If Alice updates SHARED1_SLN to use NLog v3.0, Bob may never know. If Alice updates her dependency file to reflect the change, then Bob does have the information; he just has to be made aware of the change.
Bigger Questions 1 & 4:
I believe that this is a source control issue and not something that can be solved with a dependency management tool, since such tools generally work with binaries and only fetch dependencies (they don't allow committing changes back to the dependencies). My dependency problems are not unique to Mercurial. From my experience, all source control tools have the same problem. One solution in SVN would be to just use svn:externals (or svn copies) and recursively have every component include its dependencies, creating a possibly huge tree to build a product. However, this falls apart in Visual Studio, where I really only want to include one instance of a shared project and reference it everywhere. As implied by #Clare 's answer and Greg's response to my email on the hg mailing list, keep components as flat as possible.
Bigger Questions 2 & 3:
There is a better structure as I have laid out above. I believe we have a strong use case for using subrepos and I do not see a viable alternative. As mentioned in #Clare 's answer, there is a camp that believes dependencies can be managed without subrepos. However, I have yet to see any evidence or actual references to back this statement up.
Bigger Question 5:
Still open to better ideas...
I read the tutorial many times and I feel that I am still missing something.
I'll just try to give a concrete scenario. Please help me find where I'm
wrong.
Suppose I have a repository which everyone considers "central". This means that every new developer clones from it and pulls/pushes from/to it.
Central contains three folders:

Infra (which is about to be shared code)
    .hg
    infra.txt

dev1
    dev1.txt
    .hgsub (in which there's a line --> infra = (path of infra))
    infra (subrepo)
        .hg
        infra.txt

dev2
    dev2.txt
    .hgsub (the same as in dev1 --> infra = (path to infra))
    infra (subrepo)
        .hg
        infra.txt
Now, suppose that one developer clones dev1 and another one clones dev2. What I see is that when the developer of dev1 changes infra and pushes the changes to the repository in central, the only way for the dev2 developer to know about the change in infra is to manually search for incoming change-sets in infra as a sub-repository. Generally, it means that if my project has many sub-repositories (which may themselves contain more sub-repositories), I have no way to know about the changes except by going over my sub-repositories manually.
I think that's not the way to work...
Can anyone help?
Thanks in advance,
Eyal
I think I have found something better.
You can use the --subrepos flag when checking for incoming change-sets in a repository.
This will search for incoming change-sets recursively and show the sub-repositories in which change-sets can be pulled.
This way, one can see which sub-repositories have changed and decide whether to bring the files in those sub-repositories up to date.
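Concretely, running this from the parent repository covers the whole tree (to my knowledge, a plain hg update in the parent then brings the subrepos to their recorded revisions):

    hg incoming --subrepos    # recursively list pending change-sets in all subrepos
    hg pull                   # pull the parent repo
    hg update                 # also pulls/updates subrepos per .hgsubstate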
You are going to have to pull for each repository. You might think this tedious but there's no way mercurial is going to make the decision to pull changes into your repository for you - this is a good thing.
What you can do is create a simple batch script that runs a 'hg pull' command against each repository. That at least automates the process so it feels less tedious when you really want to pull from all repos.
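For example, a minimal version of such a script, assuming all the clones sit side-by-side under one directory:

    # pull in every Mercurial repository directly under the current directory
    for d in */; do
        if [ -d "$d/.hg" ]; then
            echo "Pulling in $d"
            hg -R "$d" pull
        fi
    done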
We moved all our subrepos into one repository, which makes it much simpler to manage a change/new feature that requires alterations to all our libraries.
I like subrepos but I think they are best suited for pulling in entire repositories that others look after that remain pretty stable. When there's a lot of changes, you need a lot of discipline and a certain amount of scripting to keep manual work down to a minimum.
I want to diff all the changes I made over the last 7 or 10 days without seeing the changes of other team members, so I keep a clone, say
c:\dev\proj1
and then I keep another clone that is
c:\dev\proj2
so I can change code in proj1, and then in another shell pull code from it, merge with other team members' work, and run tests. Then, 10 days later, I can still diff all the code made by me and nobody else by going to the shell of proj1 and doing an hg diff or hg vdiff.
I think this can be done using a branch as well. Does having 2 clones like this work exactly the same as having 2 branches? Any advantage of one method over the other?
The short answer is: Yes.
Mercurial doesn't care where the changesets come from, when you merge. In that sense, branches and clones work equally well when it comes time to merge changes.
Even better: The workflow you described is exactly the strategy in Chapter 3 of the Mercurial book.
The only advantage of branches is that they have a name, so you have less incentive to merge right off. If you want to keep those proj2 changes separate, while still pushing and pulling them from proj1, give them a real branch. Again, functionally, they're the same.
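For example, the named-branch version of the same setup might look like this (the branch name is arbitrary):

    hg branch proj2-work                   # start a named branch for your changes
    hg commit -m "Work on my feature"
    hg update default                      # switch back to the main line any time
    hg merge proj2-work                    # merge when you're ready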
And yes, this is characteristic of DVCS, not uniquely Mercurial.
Note : I'm more familiar with git than hg but the ideas should be the same.
The difference will become apparent if you update both the clones (which are both editing the same branch) e.g. for a quick bug fix on the integration sandbox.
The right way would be for you to have a topic branch (your first clone) which is where you do your development and another one for integration (your second clone). Then you can merge changes from one to another as you please. If you do make a change on the integration branch, you'll know that it was made there.
hg diff -r <startrev> -r <endrev> can be used to compare any two points in Mercurial's history.
Example history:
        rev  author  description
        ---  ------  ----------------------
@       6    me      Merge
|\
| o     5    others  More other changes.
| |
| o     4    others  Other changes.
| |
o |     3    me      More of my changes.
| |
o |     2    me      My changes.
|/
o       1    others  More Common Changes
|
o       0    others  Common Changes
If revision 1 was the original clone:
Revs 2 and 3 represent your changes.
Revs 4 and 5 are other changes made during your branch development. They are pulled and merged into your changes at rev 6.
At this point, to see only the changes made by me before the merge, run hg diff -r 1 -r 3.
Why not simply have two branches? (Branching/merging is much easier and safer in a DVCS like Hg or Git than in a centralised VCS like TFS or SVN!) It would be much more secure and reliable.
This will become apparent e.g. when you want to merge the two branches/clones back together. Also, editing one branch from two different physical locations can easily lead to confusion and errors. Hg is designed to avoid exactly these kinds of situations.
Thomas
As some answers already pointed out, branches (named or anonymous) are usually more convenient than two clones because you don't have to pull/push.
But two clones have the distinct advantage of total physical separation, so you can literally work on two things at the same time, and you never need to rebuild when you switch projects.
Earlier I asked a question about concurrent development with hg, with option 1 being two clones and option 2 being two branches.