Have Mercurial completely forget a file and history - mercurial

I am trying to have Mercurial forget a file, and all the history of that file. I have been tracking some binary files that have been changing a lot, and so my .hg/store/data folder now consists of 660MB of data that I don't want to keep pushing around.
I have considered just starting a new repository, but then I would lose all my code changes, so I have dismissed this.

You can't change that kind of history directly. Remember that most revision control software is usually designed specifically to prevent this.
If you insist, you might be able to do some cherry picking to create a new repository out of the old one, selecting specific changesets to reflect the history you want to reveal.

Related

Incorporating external changes using Mercurial and TortoiseHg

I'm looking for advice on how best to organize my repository to handle outside changes. And, I have a simple user question on creating a branch in TortoiseHg. This is my first time setting this sort of thing up, so I'm having trouble figuring out the most straightforward way to handle it.
I have a website set up on a service. The theme we're using is based on one of their examples. Occasionally, they'll update the theme we're based on. And, most of the time, I would like to incorporate most of those changes.
The part that is tripping me up is that I don't necessarily want to incorporate all of their changes. So, is the correct way to do it to create two branches? One that is canonical version, straight from them, containing only their edits? And the other is our release branch, that we merge in only portions of their changes?
Will the canonical branch persist? Or, does it come into existence only when they do a new dump, and then I do a manual merge back into my release branch with the changes I want to incorporate?
If it's persistent, is there some way to use TortoiseHg to create the branch back at the root? Or, do I need to dig into the command-line syntax to do that? I know this is a one-time thing for this project. But, I'm looking for advice on how to do this in other software situations, where I want to go back to an earlier version to make a patch. I'm sure there's a tutorial for exactly this situation, but I wasn't able to come up with the correct search phrase to find it. At least, not using TortoiseHg.
If upstream theme is also version-controlled (in any SCM, supported by Mercurial), two long-term branches with grafting some subset of changesets from VENDOR to MY is, naturally, fast and easy way (another possible solution is MQ-patches on top of branch with vanilla theme, but it's slightly harder way)
If original theme not versioned, you can just grab sources from time to time, edit (incorporate own changes, remove unwanted) and commit this own changeset into single-branch repository
is there some way to use TortoiseHg to create the branch back at the root?
I can not recall or find (with quick-search) nice & obvious way to create branch in GUI-way (i.e. big button on toolbar "Branch"), but you can create branch at commit stage (anyway, branch doesn't exist while at least one changeset in this branch isn't committed). In order to create branch, starting from <some old> changeset, you have:
Update to <some old> changeset (RClick - Update...)
Commit (without changing anything), but in commit dialogue use "Branch" button on commit-toolbar (see 1) in order to change old branch-name to some new (see 2)

How to completely change an hg repo's structure

I have an hg (Mercurial) repo located at, say:
http://myhg:5000/projects/fizzbuzz
This fizzbuz directory has the following basic structure:
fizzbuzz/
src/
... thousands of source files
docs/
... lots of docs
tests/
... lots of tests
I am now completely re-engineering the fizzbuzz app. The new app's project structure will be completely different (from the top down) than the existing one:
fizzbuzz/
herps/
foo/
... thousands of foos
bar/
... thousands of bars
derps/
... lots of derps
It's essentially a brand new app. I guess one solution would be to delete the fizzbuzz repo and then create a new one and add my code to the new one. But I was wondering if there's a way to basically tell hg to erase everything in a repo (but not delete the repo), and then add in the new, re-engineered, content. Or some other way to elegantly swap out the new code base for the old. Ideas? Thanks in advance!
Sure, you can wipe a repository by deleting everything and commiting all the changes (all the deletions). If you ever need to restore or review the old code it will still be available by checking out the revisions prior to the deletion. Unless your repository is very large on disk this is probably the way to go--alternatively you can start a new repository for the new version and leave the current one as-is.
In either case deleting all the code and its history is typically unwise.
Take a look at the Convert extension. While it is, by design, not possible to change the history of your current repository, you can, via hg convert, construct a new repository based on the history of an existing repository. This is very useful for scenarios like you describe, where you need to refactor the file-structure to such a degree that the old history is no longer useful.
That said, consider just making the changes directly in your current repository. What actual benefit do you get by rewriting history? Make the changes now, and Mercurial will continue doing it's job of tracking where you came from.

How do you handle documents, images(psd), etc in you repository?

This might be a noob question. but I'm really torned between adding documents to my repository, in this case Mercurial.
by documents i meant, files that doesn't really go into your program. like PSD, doc, xls.
what's the best way to handle those files, or how do you handle your documents.
Take a look at the Largefiles extension that shipped with Mercurial 2.0 (with bugfixes since). It's designed to treat files that are binary and update rarely in a different, more efficient way.
Basically it stores those files without trying to compute diffs between versions, and anybody cloning the repo just gets the versions they need, and not all the history. This leads to faster cloning / pulling, but updates may need a connection to the remote repository to read versions of files into the local cache.
I toss them in my repository. It's nice to track changes of them and see old revisions anyway. I can see old revisions of a design document or see what the previous art was for an asset (maybe a graphic designer removed the alpha channel and he/she wasn't supposed to). Throw it in there. If it doesn't change, it's not taking up any more space with a good source control system than storing it outside of source control.

Are deleted files still downloaded with an hg clone?

In our Mercurial repo we added a really big file (and did an hg push), then deleted the big file (and did another push).
Now if someone does an hg clone will they still pull down that big file? I know it won't appear in their working directory as it was deleted, but will the file still be pulled down and stored in Mercurial internal storage?
I'd like to ensure people don't have to pull down the file. I've learned that really big files should be stored outside of Mercurial, so I deleted the file. But I was wondering if people will still be pulling down the big file - in which case I guess I will recreate the repository from scratch.
Of course it will still be in the repository.
You can always update back to older revisions, and if you update back to the revision you got when you committed the file, it'll be there in all its glory.
There are two ways to mitigate this (when you're committing, not now):
One of the big-files extensions, these essentially add big files to a secondary repository and link the two, so that if you update to a revision where the file doesn't exist, and you don't already have it, it will not get updated. ie. it's more a "on-demand" style of pulling
If the file never changes, keep it available on the network and just create some kind of link to it instead of a full copy
Right now, you got four options:
Strip away the changeset that added the file, and all the changesets that came after it. You can do that using the Mercurial Queues extension. Note that you need to do this stripping in all clones. If just one of your users push back the repository that has that file in its history to the central clone, you have the changesets back.
Rebuild the repository from scratch manually
Using the hg convert command and some filtering, the --filemap option can be used for this
Leave it as is. How big is it, will be much of a problem?
Note that rebuilding the repository, either manually or through hg convert will invalidate all clones. Anyone trying to push to your new central clone from an old clone will get a message about unrelated repositories. If any of your users are stupi^H^H^H^H^Hnot smart enough to realize that forcing the push is a bad idea, then you will have problems with this approach.
Yes, the file is still in the history. If you want to delete it completely, you need to use Mercurial Queues — see Editing History on Mercurial wiki.
Just keep in mind this breaks clones as revision IDs change.

mercurial temporarily ignore versioned files

My question is essentially the same as here but applies to mercurial. I have a set of files that are under version control, and one save operation changes quite a lot of files. Some of the resulting changes are important for revision control, and some of the changes are just junk. I can "partition" off the junk into separate files. These junk files need to be part of a basic checkout in order for it to work, but their contents (and changes over time) aren't that important for revision control. Right now I just tell all our developers not to commit these files, but we all forget and it creates a lot of extra baggage in the repository. I don't really like the svn solution proposed because there are quite a lot of files and I want a simple clone to just work without all this extra manual work, so I was wondering if mercurial has a better alternative. It's kind of like hg shelve but not quite, and kind of like ignore, but not quite. Is there some hg extension that allows for this? Can git do it?
Mercurial doesn't support this. The correct way to do it is to commit thefile.sample and then have your developers (or better you deploy script) do a copy from thefile.sample to thefile if thefile doesn't exist. That way anyone can update the example file, but there's no risk of them committing their local changes (say their personal database connect string).
Aha! So TortoiseHG's repository and global settings have an Auto Exclude List where you can define a list of files that will be unchecked by default when the status, commit, and shelve dialogs open. So they still show up, but the user has to check them in order to actually do a commit. The setting is stored in hgrc, but it's under the [tortoisehg] heading so it's not supported by mercurial per se. Nevertheless, it fits my needs.
One solution to this is to use nested tree support (submodule in git), where the "junk" would be put in a different repository (to avoid cluttering the main repo), while enabling checking out the whole thing out in a consistent manner (right version of both repos in sync).
https://www.mercurial-scm.org/wiki/Subrepository?action=show&redirect=subrepos
In git, submodules are one solution to this issue - but they are not that great UI-wise. What I do instead is to keep two completely independent repositories, and using the subtree merge strategy when I need to update the main repo with the junk repo: http://progit.org/book/ch6-7.html