How can I add complete binaries to a Mercurial patch? - mercurial

I want to use Mercurial to capture changes made to the vanilla installation of a piece of software we use. Every time we upgrade the software, we need to manually edit the various configuration files and add the 3rd party libraries that we use in the current version of the software. Creating patches for the configuration file changes is fine, but how do I add 3rd party libraries (binaries) to a Mercurial patch? Is it even possible?

If you were to try to get the patch for the 7th revision...
hg export --git -r 7 -o 7.patch

Yes, the mq extension can handle binary data just as well as textual data. It will use Git's extended patch format to save the binary data. This is transparently handled for you when you refresh a patch with modified binary files.
Whether or not this is a good idea is another question — VonC is correct when he writes that this is not the normal use case for a version control system.

Even if it may be possible, it is not advisable! (for Mercurial or any other VCS)
A Version Control System is not made to record binaries (mainly because the repository quickly grows out of proportion, takes a whole lot of disk space, and has no efficient way to store binaries as deltas)
You should record the configuration needed for each version you tag.
That can be a text file, or a maven pom for instance. Anything that allows an external mechanism (like maven) to download and locally store the right dependencies for you.
That means your patch will include changes to that text file (pom for instance), as well as the rest of the code modifications.

Related

How to use Mercurial's LargeFiles extension?

I use Mercurial for game development, and I'm trying to use the LargeFiles extension included in Mercurial 2.0 to keep track of large binary assets. Unfortunately there isn't a whole lot of documentation on the extension, so I'm not sure how people are expected to use it.
For example, is there any way to safely clean out the .hg/largefiles directory? If I'm on the tip revision, and expect to always have internet access, then I don't need the old versions of largefiles cluttering up the repository, since that's the whole point of using the LargeFiles extension.
Also, how do I have more fine-grained control over where the largefile store is? I can only assume that it's created somewhere on the computer that ran hg init, but I have no idea about the details.
Thanks!
I don't have any guidance on how to safely clean out the .hg/largefiles directory.
Largefiles Store
By default, the largefiles store seems to be kept at one of the following locations:
Windows: C:\Users\Username\AppData\Local\largefiles
OSX: /Users/username/Library/Caches/largefiles
Linux: (This is my best guess)
/home/username/largefiles
or /home/username/.cache/largefiles
User Configured:
This, however, can be changed in the global settings file using the usercache setting as follows:
[largefiles]
usercache = c:\path\to\largefiles\cache\
Note: This is not documented yet. This makes me wonder if it is subject to change.
Sources:
Largefiles Extension Documentation
User cache paths - https://www.mercurial-scm.org/repo/hg/file/41453d55b481/hgext/largefiles/lfutil.py (lines 84-103)
Undocumented largefiles.usercache setting - https://bz.mercurial-scm.org/show_bug.cgi?id=3088
I'm just posting this for anyone else coming into the thread from a search.
There's currently an issue using the largefiles extension in the mercurial python module when hosted via IIS. See this post if you're encountering issues pushing large changesets (or large files) to IIS via TortoiseHg.
The problem ultimately turns out to be a bug in SSL processing introduced in Python 2.7.3 (probably explaining why there are so many unresolved posts from people looking into problems with Mercurial). Rolling back to Python 2.7.2 let me get a little further ahead (blocked at 30Mb pushes instead of 15Mb), but to properly solve the problem I had to install the IISCrypto utility to completely disable transfers over SSLv2.

Disable file history for a particular set of files in Mercurial

I understand that in Mercurial you can never remove the history of a file unless you do something like this. Is there any way to prevent history from ever being created for certain files? If any other repository system is capable of doing that, please mention it as well.
Why would I want that? Well, in our build system, new binaries are constantly being committed which the non-programmers can use to run the program without compiling every time (the compilation is done by the build system). Each time new binaries are committed, the old ones are useless as far as we are concerned. It is unnecessarily taking up space. If the new binary messes up by any chance, we can always revert back to older source and rebuild (assuming there is a way to disable history for specific files).
As you found out, you cannot do what you want directly in Mercurial.
I suggest you put the binaries somewhere else -- a Subversion subrepo would be a good choice. That way you will only download the latest version of each file on the client, but you will have all versions on your server (where it should be easy to add more disk space).

Mercurial (Hg) and Binary Files

I am writing a set of django apps and would like to use Hg for version control. I would like each app to be independent of the others, so in each app there may be a directory for static media that contains images that I would not want under version control. In other words, the binary files would not all be in one central location.
I would like to find a way to clone the repository that would include copies of the image files. It also would be great if when I did a merge, if there were an image file in one repo and not another, that there would be some sort of warning.
Currently I use a python script to find images and other binary files that are in one repo, but not the other. But a lot of people must face this problem, so there must be a more robust and elegant solution.
One other thing... for reasons I do not want to go into, usually one of my repos is on a Windows machine, and the other is on Linux. So a cross-platform solution would be nice.
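For readers wondering what such a comparison script might look like, here is a minimal sketch. It is not the asker's actual script: the extension list and the function names `binary_files` and `missing_binaries` are assumptions made for illustration. It walks two working copies, skips the `.hg` directory, and reports binary assets present in one tree but not the other, using forward-slash relative paths so results from Windows and Linux are comparable.

```python
import os
from pathlib import Path

# Hypothetical set of extensions treated as "binary assets"; adjust as needed.
BINARY_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".ico"}


def binary_files(root):
    """Return the set of binary-asset paths under root, relative to root."""
    root = Path(root)
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into Mercurial's own metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".hg"]
        for name in filenames:
            path = Path(dirpath) / name
            if path.suffix.lower() in BINARY_EXTENSIONS:
                # as_posix() gives the same spelling on Windows and Linux.
                found.add(path.relative_to(root).as_posix())
    return found


def missing_binaries(repo_a, repo_b):
    """Return (only_in_a, only_in_b) as sorted lists of relative paths."""
    a, b = binary_files(repo_a), binary_files(repo_b)
    return sorted(a - b), sorted(b - a)
```

Calling `missing_binaries("path/to/repo1", "path/to/repo2")` returns two lists that could be printed as a warning before or after a merge.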
Since Mercurial 2.0 the largefiles extension is included in the main distribution. That extension keeps and manages large files outside of the "normal" repository, so that you get the benefits of a DVCS without the drawback of runaway growth in repository size and processing time.
Other extensions that work along similar lines are SnapExtension and BigFilesExtension. However, those two are not distributed with Mercurial (you have to get them manually).
Mercurial can track any kind of file. For binary files, if something changes then the whole file gets replaced, not just the changes.
As for getting a warning if one repo doesn't contain a file: the point of a DVCS is that the repos are related but autonomous. You could always check and see what files were added during a sync or merge operation.
The current Mercurial book (by Bryan O'Sullivan) says that Mercurial stores diffs for binary files as well. How efficient this is obviously depends on the nature of the changes to the binary files.

How good is my method of embedding version numbers into my application using Mercurial hooks?

This is not quite a specific question; it is more a request for criticism of my current approach.
I would like to include the program version number in the program I am developing. This is not a commercial product, but a research application so it is important to know which version generated the results.
My method works as follows:
There is a "pre-commit" hook in my .hg/hgrc file linked to version_gen.sh
version_gen.sh consists solely of:
hg parent --template "r{rev}_{date|shortdate}" > version.num
In the makefile, the line version="%__VERSION__%" in the main script is replaced with the content of the version.num file.
Are there better ways of doing this? The only real shortcoming I can see is that if you only commit a specific file, version.num will be updated, but it won't be committed, and if I tried to always commit that file as well, that would result in an infinite loop (unless I created some temp file to indicate I was already in a commit, but that seems ugly...).
The problem
As you've identified, you've really created a Catch-22 situation here.
You can't really put meaningful information in the version.num file until the changes are committed and because you are storing version.num in the repository, you can't commit changes to the repository until you have populated the version.num file.
My solution
What I would suggest is:
Get rid of the "pre-commit" hook and hg forget the version.num file.
Add version.num to your .hgignore file.
Adjust version_gen.sh to consist of:
hg parent --template "r{node|short}_{date|shortdate}" > version.num
In the makefile, make sure version_gen.sh is run before version.num is used to set the version parameter.
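The build-time substitution described above could be sketched as follows. This is only an illustration under stated assumptions, not the answerer's actual build system: the helper names `hg_version` and `embed_version` are hypothetical, and it assumes the `hg` executable is on the PATH. The `hg parent --template` invocation and the `%__VERSION__%` placeholder are taken from the question and answer.

```python
import subprocess

# Placeholder string used in the question's main script.
PLACEHOLDER = "%__VERSION__%"


def hg_version(repo_dir="."):
    """Ask Mercurial for the working directory's parent revision.

    Raises subprocess.CalledProcessError if hg fails (for example when
    run outside a repository) and FileNotFoundError if hg is not
    installed, so a build without revision information stops here.
    """
    result = subprocess.run(
        ["hg", "parent", "--template", "r{node|short}_{date|shortdate}"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def embed_version(source_text, version):
    """Replace the placeholder with the version string.

    Refuses to continue if the placeholder is missing, rather than
    silently producing output without a version in it.
    """
    if PLACEHOLDER not in source_text:
        raise ValueError("version placeholder not found")
    return source_text.replace(PLACEHOLDER, version)
```

A build step would call `embed_version(text, hg_version())` on a copy of the main script, leaving the tracked file itself unchanged.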
My reasons
As @Ry4an suggests, getting the build system to insert revision information into the software at build time, using information from the Version Control System, is a much better option. The only problem with this is if you try to compile the code from an hg archive of the repository, where the build system cannot extract the relevant information.
I would be inclined to discourage this however - in my own build system, the build failed if revision information couldn't be extracted.
Also, as @Kai Inkinen suggests, using the revision number is not portable. Rev 21 on one machine might be rev 22 on another. While this may not be a problem right now, it could be in the future, if you start collaborating with other people.
Finally, I explain my reasons for not liking the Keyword extension in a question of mine, which touches on similar issues to your own question:
I looked at Mercurials Keyword extension, since it seemed like the obvious solution. However the more I looked at it and read peoples opinions, the more that I came to the conclusion that it wasn't the right thing to do.
I also remember the problems that keyword substitution has caused me in projects at previous companies. ...
Also, I don't particularly want to have to enable Mercurial extensions to get the build to complete. I want the solution to be self contained, so that it isn't easy for the application to be accidentally compiled without the embedded version information just because an extension isn't enabled or the right helper software hasn't been installed.
Then in comments to an answer which suggested using the keyword extension anyway:
... I rejected using the keyword extension as it would be too easy to end up with the string "$Id$" being compiled into the executable. If keyword expansion was built into mercurial rather than an extension, and on by default, I might consider it, but as it stands it just wouldn't be reliable. – Mark Booth
I don't think that there can be a more reliable solution. What if someone accidentally damages .hg or builds not from a clone but from an archive? – Mr.Cat
@Mr.Cat - I don't think there can be a less reliable solution than the keywords extension. Anywhere you haven't explicitly enabled the extension (or someone has disabled it) then you get the literal string "$Id$" compiled into the object file without complaint. If mercurial or the repo is damaged (not sure which you meant) you need to fix that first anyway. As for hg archive, my original solution fails to compile if you try to build it from an archive! That is precisely what I want. I don't want any source to be compiled into our apps without its source being under revision control! – Mark Booth
What you are trying to do is called Keyword Expansion, which is not supported in Mercurial core.
You can integrate that expansion in make file, or (simpler) with the Keyword extension.
This extension allows the expansion of RCS/CVS-like and user defined keys in text files tracked by Mercurial.
Expansion takes place in the working directory or/and when creating a distribution using "hg archive"
That you use a pre-commit hook is what's concerning. You shouldn't be putting the result of version_gen.sh into the source files themselves, just into the build/release artifacts, which you can do more accurately with an 'update' hook.
You don't want the Makefile to actually change in the repo with each commit; that just makes merges hell. You want to insert the version after checking out the files, in advance of a build, which is what an update hook does.
In distributed systems like Mercurial, the actual "version number" does not necessarily mean the same thing in every environment. Even if this is a single person project, and you are really careful with having only your central repo, you would still probably want to use the sha1-sum instead, since that is truly unique for the given repository state. The sha1 can be fetched through the template {node}
As a suggestion, I think that a better workflow would be to use tags instead, which btw are also local to your repository until you push them upstream. Don't write your number into a file, but instead tag your release code with a meaningful tag like
RELEASE_2
or
RELEASE_2010-04-01
or maybe script this and use the template to create the tag?
You can then add the tag to your non-versioned (in .hgignore) version.num file to be added into the build. This way you can give meaningful names to the releases and you tie the release to the unique identifier.

What exactly does the word Patch mean when referring to 'submitting a patch'?

What exactly does the word patch mean when referring to 'submitting a patch'?
I've seen this used a lot, especially in the open source world. What does it mean and what exactly is involved in submitting a patch?
It's a file with a list of differences between the code files that have changed. It's usually in the format generated by doing a diff -u on the two files. Most version control systems allow the easy creation of patches but it's generally in that same format.
This allows the code change to be easily applied to someone else's copy of the source code using the patch command.
For example:
Let's say I have the following code:
<?php
$foo = 0;
?>
and I change it to this:
<?php
$bar = 0;
?>
The patch file might look like this:
Index: test.php
===================================================================
--- test.php (revision 40)
+++ test.php (working copy)
@@ -1,3 +1,3 @@
 <?php
-$foo = 0;
+$bar = 0;
 ?>
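A unified diff like this can also be produced programmatically. As a small sketch (using Python's standard `difflib` module rather than `diff -u` or a version control system, so the header lines lack the `Index:` line and revision labels that Subversion adds), the same change could be generated like so:

```python
import difflib

# The two versions of test.php from the example, one list entry per line.
old = ["<?php\n", "$foo = 0;\n", "?>\n"]
new = ["<?php\n", "$bar = 0;\n", "?>\n"]

# unified_diff yields the header lines, the hunk header and the hunk body.
patch = "".join(
    difflib.unified_diff(old, new, fromfile="test.php", tofile="test.php")
)
print(patch)
```

Lines starting with `-` are removed, lines starting with `+` are added, and lines starting with a space are unchanged context that helps the patch command locate the change.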
Richard Jones, a developer at Red Hat, has a nice little primer on submitting code to open source projects which covers making and submitting patches.
A patch is usually a file that contains information on how to change something (very often to fix a bug, but it could also be an enhancement). There are different kinds of patches.
A source code patch contains information on how one or multiple source code files need to be modified. You can easily generate them using the diff command and you can apply them using the patch command (on Linux/UNIX systems these commands are standard).
However, there are also binary patches. A binary patch contains information on how certain bytes within a binary need to be changed. Binary patches are, of course, rare in the open source world, but in the early days of computers I saw them a lot to modify shipped binaries (usually to work around a bug).
Submitting a patch means you have locally fixed something and now you send the file to someone, so they can apply this patch to their local copy or to a public copy on the web, and other users can benefit from the fix.
Patches are also often used if you have some source code that almost compiles on a certain platform, but some tiny changes are necessary to really have it compile there. Of course you could take the source, modify it and offer the modified code for download. But what if the original source changes again (e.g. bugs get fixed or small enhancements are added)? Then you would have to re-download the source, apply the changes again and offer the new modified version. It's a lot of work to keep your modified source up-to-date. Instead of modifying, you create a diff between the original and your modified copy and store it on your server. If a user now wants to download and compile the app from source, they can first download the latest & greatest version of the original source, then apply your patch (so it will compile), and they always have the latest version without you having to change the patch. A problem will only arise if the original source has been changed in exactly one of the places your patch modifies. In this case the system will refuse to apply the patch and a new patch needs to be made.
A patch is a file containing all of the necessary information to turn the maintainer's source tree into your own. It's usually created by tools like diff or svn diff or git format-patch.
Traditionally, open-source projects accept submissions from normal schlubs in the form of patches so they don't have to give others commit access to their repositories.
A patch, usually in the form of a .patch file, is a common flat file format for transmitting the differences between two sets of code files. So if you are working on an open source project, and make code changes to files, and want to submit that to the project owner to be checked in (usually because you don't have checkin rights), you would do so via a patch.
WinMerge has this functionality built in, as do many other tools like TortoiseSVN.
A patch file represents the difference between existing source and source you've modified. It is the primary means of adding features or fixing bugs in many projects.
You create a patch using the diff command (for example).
You can then submit this patch to the development mailing list, and if it is received well, a committer will apply the patch (thus automatically applying your changes) and commit the code.
Patches are applied using the patch command.
Generally it implies submitting a unified diff file with the aggregate changeset for a feature. You can read more about patches on Wikipedia. Several version control systems (svn, git, etc.) will create a patch file for you based on a changeset.
1. n. A temporary addition to a piece of code, usually as a quick-and-dirty
remedy to an existing bug or misfeature. A patch may or may not work, and may or may not
eventually be incorporated permanently into the program. Distinguished from a diff
or mod by the fact that a patch is generated by more primitive means than the rest
of the program; the classical examples are instructions modified by using the front
panel switches, and changes made directly to the binary executable of a program
originally written in an HLL. Compare one-line fix.
See the entire definition in the jargon file here
Patch is also used in the act of updating system binaries. Microsoft sends out patches all the time but they aren't source code. They are .msp files that install improved binaries. As with all computer science terms, patch is overloaded.
I've always believed the term meant a bug fix, like a knee patch Mom used to put on your holey jeans.