How to store my binary assets with Mercurial? - mercurial

I'm starting a game development project and my team and I will be using Mercurial for version control and I was wondering what a more appropriate way to store the binary assets for the game would be. Basically, I have 2 options:
Mercurial 2.1 has the largefiles extension, but I don't know too much about it. It seems like it'll solve the 'repository bloat' problem but doesn't solve binary merge conflict issues.
Keeping the binary assets in a SVN checkout, as a subrepo. This way we can lock files for editing and avoid merge conflicts, but I would really like to avoid having to use 2 version control systems (especially one that I don't really like that much).
Any insight/advice or other options I haven't thought of?

As you surmised large files will do what you need. For merging binaries you can set up a merge tool if one exists for your file type. Something like this:
[merge-tools]
mymergetool.priority = 100
mymergetool.premerge = False
mymergetool.args = $local $other $base -o $output
myimgmerge = SOME-PROGRAM-THAT-WILL-MERGE-IMAGES-FOR-YOU
[merge-patterns]
**.jpg = myimgmerge
**.exe = internal:fail
In general though, merging non-text things will always be a pain using a source control tool. Digital Asset Management applications exist to make that less painful, but they're not decentralized or very pleasant with which to work.

You're correct that the largefiles extension will avoid bloating the repository. What happens is that you only download the large files needed by the revision you're checking out. So if you have a 50 MB file and it has been edited radically 10 times, then the versions might take up 500 MB on the server. However, when you do hg update you only download the 50 MB version you need for that revision.
You're also correct that the largefiles extension doesn't help with merges. Infact, it skips the merge step completely and only prompts you like this:
largefile <some large file> has a merge conflict
keep (l)ocal or take (o)ther?
You don't get a chance to use the normal merge machinery.
To do locking, you could use the lock extension I wrote for a client. They wanted it for their documentation department where people would work with files that cannot be easily merged. It basically turns Mercurial into a centralized system similar to Subversion: locks are stored in a file in the central repository and client contact this repository when you run hg locks and before hg commit.

Related

Creating Mercurial subrepositories, while maintaining the history

I am about to make some major changes to my Mercurial repositories. As I am going to be using a Feature of Last Resort, I am looking for some advice and reassurance that I am not doing something stupid.
Where I Am:
I have a Mercurial repository with a complete history of all of these files:
/source
/secret_subsystem
/unclassified_subsystem
/common_files
Source is the Mercurial repository.
The secret subsystem folder contains code which is intellectual property we want to keep in-house.
The unclassified subsystem folder contains code which we want to outsource to a third-party to maintain.
The common files folder contains code that both subsystems depend on. We will be keeping ownership, but we want to share it with the third-party.
Obviously, I can't just push out my whole repository to the third-party company. The third-party would see too much.
Where I Want To Be:
Having read up on subrepositories, this is where I think I need to be:
Have THREE subrepositories: secret_subsystem, unclassified_subsystem, common_files.
Ensure there are no other files at the /source level, due to this recommendation.
Have the outsourcers create a brand new respository at the source level on their machines, and two corresponding subrepositories.
Push the unclassified_subsystem and common_files to the out-sourcer, pulling back unclassified_subsystem as required, pushing out new common_files repositories as required.
Maintaining History:
I would like to maintain the commit history, as much as practical, for all of the subsystems.
To do this, I will run the hg convert extension command three times, once for each subrepository. I will filter down to only the files that belong in each subrepository. I may also need to map filenames to move the files from ./common_files/foo.py to ./foo.py (for example).
My Questions:
1) Is dividing up a repository into subrepository a reasonable way of implementing security - viz. that a third-party can only see and edit some of our files?
2) Is using hg convert a reasonable way to create a subrepository from an existing repository, while still maintaining the history?
3) Will hg convert's filter strip out (a) all commits messages about files NOT in the filtered respository? Will it filter out all diffs for files NOT in the filtered repository?
There is another implied question: Am I heading into a world of hurt? If so, I will simply give up on retaining file histories, or even make them seperate repositories and forget about cross-repository commits.
I've not used subrepos so far, but I can answer 2) and 3):
2) Yes, sounds reasonable.
3) Yes.
There was a similar question just 2 days ago: Convert mercurial repository to subrepositories with full history (like hg log -f)

hg automatic merges with diff3

Consider a Hg project with two central branches/clones - e.g. DEV and PRD. When someone pushes a hotfix to PRD, an automated script on the central server goes to DEV and pulls the new changes. It then tries to merge the hotfix into DEV.
The problem is that the merge tool integrated into Hg is terrible - as soon as there are parallel changes to the same file it will fail. Take the following example:
parent:3,7c
four
five
six
seven
child1:3,7c
four
five5
six6
seven7
child2:3,7c
fourmore
five
six
more
seven
As you can see, there are no actual conflicts here. If we do the merge locally with kdiff, it solves this simple case without user input!
I'd like a way to get the central server to manage these cases. I thought of using kdiff3 in silent mode, but I can't install kdiff3 on it (it's a CLI only system that we don't even have admin access to), but maybe there is a way to plug diff3 into hg merge so it can resolve simple cases like this? I tried setting "[ui] /n merge = diff3" in the hgrc, but it just spits out the three versions to stdout. Am i missing some additional configuration? Or is there an easier/better tool?
Thanks a lot
To use diff3 as your merge tool, you need to add
[merge-tools]
diff3.args = $local $base $other -m > $output
to your configuration file. You can set the priority if needed, see the wiki. You'll also find more complicated recipes for using diff3 there.
However, I tested how diff3 treats your scenario with edits on adjacent lines, and just like Mercurial, it also refused to merge this cleanly. Different merge tools have different thresholds for what they consider a "conflict" and it seems that KDiff3 is more forgiving than Mercurial and diff3.
I suggest that you let the developer who makes the hotfix on PRD be responsible for merging it into DEV locally where he has access to tools like KDiff3. Automated merging on a server is generally considered to be bad — merges should be verified just a little by a human before you commit them. It's not much extra work after having made the hotfix.

mercurial temporarily ignore versioned files

My question is essentially the same as here but applies to mercurial. I have a set of files that are under version control, and one save operation changes quite a lot of files. Some of the resulting changes are important for revision control, and some of the changes are just junk. I can "partition" off the junk into separate files. These junk files need to be part of a basic checkout in order for it to work, but their contents (and changes over time) aren't that important for revision control. Right now I just tell all our developers not to commit these files, but we all forget and it creates a lot of extra baggage in the repository. I don't really like the svn solution proposed because there are quite a lot of files and I want a simple clone to just work without all this extra manual work, so I was wondering if mercurial has a better alternative. It's kind of like hg shelve but not quite, and kind of like ignore, but not quite. Is there some hg extension that allows for this? Can git do it?
Mercurial doesn't support this. The correct way to do it is to commit thefile.sample and then have your developers (or better you deploy script) do a copy from thefile.sample to thefile if thefile doesn't exist. That way anyone can update the example file, but there's no risk of them committing their local changes (say their personal database connect string).
Aha! So TortoiseHG's repository and global settings have an Auto Exclude List where you can define a list of files that will be unchecked by default when the status, commit, and shelve dialogs open. So they still show up, but the user has to check them in order to actually do a commit. The setting is stored in hgrc, but it's under the [tortoisehg] heading so it's not supported by mercurial per se. Nevertheless, it fits my needs.
One solution to this is to use nested tree support (submodule in git), where the "junk" would be put in a different repository (to avoid cluttering the main repo), while enabling checking out the whole thing out in a consistent manner (right version of both repos in sync).
https://www.mercurial-scm.org/wiki/Subrepository?action=show&redirect=subrepos
In git, submodules are one solution to this issue - but they are not that great UI-wise. What I do instead is to keep two completely independent repositories, and using the subtree merge strategy when I need to update the main repo with the junk repo: http://progit.org/book/ch6-7.html

How do I configure Mercurial to not commit specific config files?

My team is switching to Mercurial. Our projects all have a config file (web.config or app.config, and a few bat files as well - we are a C# shop). These files need to be part of the repository. When a developer clones the repository, local changes are needed to their config files to get them working. For example, a project's config file may need a connection string to the developer's database, or other environment-specific info. We don't want these changes ending up in the repository. And from time to time we do make changes to these configs that do need to get into the repository and distributed to the team and eventually the customer.
What is the easiest way for us to configure or use Mercurial so that these files are not getting committed by accident? I would like to be forced to make an explicit commit of such files, yet merges from the repo would automatically come down in updates.
This has to be a problem someone else has faced, but as Mercurial newbies we are all at a loss for the best solution.
Edit:
A similar question that may share some common solutions, but is not the same as this question, can be found at: Conditional Mercurial Ignore File
I am including this in case that other question might provide the answer you are looking for.
The typical way to handle this is to store templates for the configuration files in your repositor, and add the actual configuration files to the ignore list in Mercurial.
This way, you have pristine, unmodified, copies of each configuration files available at all times, even for new developers who clone from scratch, but in order to make the configuration files usable, you need to make a local copy of it to the actual configuration file name, and modify the file. You could also use compare/merge programs, such as Beyond Compare, to compare a new version of the template file with your local copy of an older version, to see what changed, and add in the missing bits.
If you need to hard prevent committing the actual configuration files, you need a pre-commit or pre-push hook that does this.
In your .hg/hgrc file do this:
[defaults]
commit = -X Projectname/web.config
(assuming "ProjectName" is the project subdir)
Edit:
Also, if you're using Tortoise HG - add this as well:
[tortoisehg]
ciexclude = Projectname/Web.config,Projectname/App_Data/DBFile.mdf
(by the way mind the FORWARD slash in folder-path! Even on Windows!)

Mercurial (Hg) and Binary Files

I am writing a set of django apps and would like to use Hg for version control. I would like each app to be independent of the others so in each app there may be a directory for static media that contains images that I would not want under version control. In other words, the binary files would not all be in one central location
I would like to find a way to clone the repository that would include copies of the image files. It also would be great if when I did a merge, if there were an image file in one repo and not another, that there would be some sort of warning.
Currently I use a python script to find images and other binary files that are in one repo, but not the other. But a lot of people must face this problem, so there must be a more robust and elegant solution.
One one other thing...for reasons I do not want to go into, usually one of my repos is on a windows machine, and the other is on Linux. So a crossplatform solution would be nice.
Since Mercurial 2.0 the extension largefiles is now included in the main distribution. That extension keeps and manages large files outside of the "normal" repository in a way that you get the benefit of DCVS but without the benefit of exponential size and processing time growth.
Other extension that work along similar lines are SnapExtension and BigFilesExtension. However, those two are not distributed with Mercurial (you have to get them manually).
Mercurial can track any kind of file, for binary files if something changes then the whole file gets replaced not just the changes.
On the getting a warning if one repo doesn't contain a file, that's kind of the point of a DVCS is that the repos are related but are autonomous. You could always check and see what files were added during a synch or merge operation.
The current Mercurial book (by Bryan O'Sullivan) says, that Mercurial stores diffs also for binary files. How efficient this is, obviously depends on the nature of changes to binary files.