We have a main repository of code, separated into subfolders such as SQL, Websites, etc. We are looking to add an additional folder for our SSIS packages and similar items so that we can keep backups. However, they are stored in folder structures outside our code repository, on three separate servers. We need to maintain the original folder structure on these servers containing the SSIS packages.
Normally I would put Mercurial on each of the machines hosting files that were to be added to the repository, pull down a copy of our repo, and add the files to the repo. However, in this case I won't be able to move the files into the repo folder after I pull down a copy, because the directory structure needs to be maintained.
The only alternative I can think of is to create a second repository for the SSIS packages, but this still runs into the issue that there are three servers with SSIS files to be added, each with its own unique folder structure. That means we would have to either change the directory structure of the three servers to match each other, or create three separate repositories.
Please note that the goal here is not for "installing" these files to the servers - the files are already on the intended servers and are (rarely) modified locally when required - it's purely for backing up the versions and maintaining a minimal number of mercurial repositories.
Is it possible to maintain a unique individual folder structure on a local machine while committing files to Mercurial?
An example folder structure was requested in the comments:
Current Repo structure:
Repo
    Documentation
    Reports
    SQL
    Applications
        Application A
        Application B
The goal is to add a new folder so it will look like this:
Repo
    Documentation
    SSIS
        Server 1
        Server 2
    Reports
However, each server stores the files in a different directory structure (one directly in the root of the C drive, one in a separate subdirectory, and the third deep in a legacy Visual SourceSafe directory), and the files are referenced externally, so changing the locations to match the repo is suboptimal.
Hence the question - is there a way to add files to a mercurial repo without dropping them into a pulled copy of the repo?
First of all, what you're trying to do doesn't sound like a great idea to me. That said, I think you can do what you want using symlinks in the file system.
Mercurial knows how to manage symbolic links, and you could make a tree of symbolic links for each machine pointing back to the real files in a different tree.
So it's possible, but it seems like a very error-prone process to me. I think you're far better off using an install tool and putting the "Makefile", rake tasks, or similar in your repo.
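A minimal sketch of the symlink-tree idea, in Python, assuming a file system and account that allow symlinks to be created. The function name `mirror_as_symlinks` and any paths passed to it are hypothetical, not part of Mercurial. It walks the real package directory and builds a matching tree of links inside the working copy, leaving the original files where they are.

```python
import os


def mirror_as_symlinks(source_root, repo_subdir):
    """Recreate source_root's directory tree under repo_subdir, with one
    symlink per file pointing back at the real file.

    Returns the list of link paths that were created.
    """
    created = []
    source_root = os.path.abspath(source_root)
    for dirpath, _dirnames, filenames in os.walk(source_root):
        # Path of the current directory relative to the external tree.
        rel = os.path.relpath(dirpath, source_root)
        target_dir = os.path.normpath(os.path.join(repo_subdir, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            link = os.path.join(target_dir, name)
            # Skip links (or files) that already exist from an earlier run.
            if not os.path.lexists(link):
                os.symlink(os.path.join(dirpath, name), link)
                created.append(link)
    return created
```

One thing to check before relying on this: Mercurial records a symlink as a link (its target path), not as the contents of the file it points to, so committing the link tree may not capture the package contents themselves.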
Here's my scenario: When I initially created my Mercurial repo, I used hg add to add all *.pl *.sh and *.sql scripts to the repo. I later learned how to use the .hgignore file to exclude other files from the repo. One of the files I needed to exclude was a *.sql file that is generated by a script, so it is essentially a data file that constantly changes when the script runs that produces it; thus, I added it explicitly to the .hgignore file a few revisions ago.
Today, I want to update to a prior revision before this *.sql file was added to the .hgignore, so that I can create a branch off of it. However, when I try to update the working directory to this prior version, I get the following error:
a.sql: untracked file differs
abort: untracked files in working directory differ from files in requested revision
I know that one way I could get around this problem is to delete the file before trying to update to the prior revision, either by manually deleting it or using hg update --clean --check.
That may work in this particular case, since the file is auto-generated by a script each time, and so I don't care about the data that is currently in it.
However, I'm trying to find out what is the safe way people would generally handle this situation when they decide to ignore a file set (like a set of data files that aren't auto-generated) and need to return to a previous revision before they were marked to be ignored, especially if they wanted to retain the most current content in those file sets while still being able to view earlier revisions of files that Mercurial is actively tracking.
I've also considered that you could back up the files, but I think that is only a reasonable solution if this is a one-off case. If you want the ability to hg update to previous revisions frequently and at whim, then it becomes quite tedious to back up the data each time before you update to an earlier revision (it also does nothing to guarantee that others won't delete the data that isn't being tracked in the repo).
Thanks for the help.
However, I'm trying to find out what is the safe way people would generally handle this situation when they decide to ignore a file set (like a set of data files that aren't auto-generated) and need to return to a previous revision before they were marked to be ignored, especially if they wanted to retain the most current content in those file sets while still being able to view earlier revisions of files that Mercurial is actively tracking.
It depends.
If you have exclusive control over this repository, and have the practical ability to require everyone to re-clone from it, then you can use hg convert to exclude the files from the old revisions. This is by far the cleanest option, but it will change the revision identifiers (hashes) for those revisions and all of their topological descendants. This is why everyone has to re-clone; their old clones will not interact properly with the new repository.
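As a rough illustration of the hg convert route, the convert extension accepts a filemap file whose `exclude` lines name paths to drop from every converted revision. The helper below is a hypothetical Python sketch that only generates that file's text; the path `a.sql` is taken from the question, and the repository names in the comment are placeholders.

```python
def build_exclude_filemap(paths):
    """Return filemap text telling `hg convert` to drop the given paths.

    Assumes simple paths without spaces; anything more exotic would need
    quoting, which this sketch does not attempt.
    """
    return "".join("exclude {}\n".format(path) for path in paths)


# Typical use (placeholder repository names), with the convert
# extension enabled:
#   1. write build_exclude_filemap(["a.sql"]) to filemap.txt
#   2. hg convert --filemap filemap.txt old-repo new-repo
```

The converted repository has new changeset hashes, which is the reason for the re-clone requirement mentioned above.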
If you can't do that, you can copy the files somewhere else (you do have backups already, right?), clobber the originals with the old versions, and then restore them from your copy. This has to be done whenever you check the files out, so it is definitely suboptimal. You may be able to make this slightly easier by keeping the files outside the repository and checking in symlinks to the files, but you'll still have to fix up the symlinks whenever you check out an old version.
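The copy-aside-and-restore routine could be wrapped in a small script so it is less tedious. This is only a sketch: `update_preserving` is a hypothetical helper, and the actual update step is passed in as a callable so the sketch does not depend on how hg is invoked.

```python
import os
import shutil
import tempfile


def update_preserving(paths, do_update):
    """Move the given untracked files aside, run do_update(), then put
    the saved copies back, overwriting whatever the update checked out."""
    stash = tempfile.mkdtemp(prefix="untracked-")
    saved = []
    for index, path in enumerate(paths):
        if os.path.exists(path):
            backup = os.path.join(stash, str(index))
            shutil.move(path, backup)
            saved.append((path, backup))
    try:
        do_update()
    finally:
        # Restore even if the update failed, so no data is lost.
        for path, backup in saved:
            os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
            if os.path.lexists(path):
                os.remove(path)
            shutil.move(backup, path)
        shutil.rmtree(stash, ignore_errors=True)
```

For example, `do_update` might be `lambda: subprocess.run(["hg", "update", "-r", rev], check=True)` for some revision `rev`. After the restore, Mercurial will report the file as modified in the old revision, which is expected with this approach.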
However, what you describe is not the normal use case for Mercurial. Typically, untracked files are autogenerated, or at least able to be regenerated from tracked files. The operating assumption is that untracked files are not important and can be discarded at any time. Mercurial doesn't actually do this, because that would be rude, but neither does it make any special effort to preserve them when (for example) you make a bundle of the repository.
If you need to deal with versioning of object files, it is typical to store them in a separate artifact repository or some other system. This can be more difficult to manage because you have to reunite the binaries with the source code when you do a build. But it is much more robust than keeping the binaries loose in the repository and hoping they won't get accidentally overwritten or deleted.
Another option is to collapse the binary to text and then place the text under version control. This is always possible (e.g. take a hexdump) but may or may not be practical or reasonable, depending on the file format. For a compressed file format (e.g. tarballs, most image files, etc.) the hexdump is not going to be any easier to 3-way merge than the original binary, so there's little point in it. Similarly, if the binary is huge, the hexdump will be huge too. On the other hand, if a binary is compiled from source code, it is entirely normal to store the source and discard the binary. For something structured like an SQLite database, you might try storing an SQL script which will generate the database. For a zip file or tarball, store the contents. And so on. All of these things can be regenerated using make or a similar tool whenever you check things out, and you can automate this with a repo hook.
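For the SQLite case, a minimal sketch using Python's standard `sqlite3` module: `iterdump()` yields the SQL statements that recreate a database, and `executescript()` replays them. The function names are made up for illustration, and the sketch assumes the database is small enough to hold its dump in memory.

```python
import sqlite3


def dump_database_as_sql(db_path):
    """Return an SQL script (as text) that recreates the database."""
    connection = sqlite3.connect(db_path)
    try:
        return "\n".join(connection.iterdump()) + "\n"
    finally:
        connection.close()


def rebuild_database(sql_text, db_path):
    """Recreate a database from a dumped script.

    db_path should not already contain the dumped tables.
    """
    connection = sqlite3.connect(db_path)
    try:
        connection.executescript(sql_text)
    finally:
        connection.close()
```

The dumped text is what would go under version control; the rebuild step is what a make target or hook would run after checkout.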
My company uses Git for “version control”, etc. Currently it is used for C, C# and Python. I have been asked to add the database schemas together with the more “complex” SQL (no idea when it becomes “complex”) to the repository. Currently the database is backed up after changes have been made to the schemas or after data has been added (at the moment it is purely a development environment). Having looked at Git, database schemas and the like do not really seem (to me) to map onto it. Should I be considering another package for “source control” to complement the existing MySQL backups?
Thank you...
Assuming you are just wanting to store the SQL scripts that can recreate your DB schema without any data in it (CREATE TABLE, VIEW, INDEX, etc.) then Git seems like a perfectly good option. Git is generally good for version control of textual data, such as SQL scripts.
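One way to produce such a schema-only script is `mysqldump` with its `--no-data` option. The Python sketch below is an assumption-laden illustration: the function names and the database name are hypothetical, and it assumes connection credentials come from a MySQL option file such as `~/.my.cnf` rather than the command line.

```python
import subprocess


def schema_dump_command(database):
    """Build the mysqldump command line for a schema-only dump."""
    # --no-data writes CREATE statements but no row data.
    return ["mysqldump", "--no-data", database]


def write_schema(database, output_path):
    """Run mysqldump and save the schema script to output_path."""
    with open(output_path, "w") as handle:
        subprocess.run(schema_dump_command(database), stdout=handle, check=True)
```

The resulting file, for example `schema.sql`, is plain text and can be committed and diffed like any other source file.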
The rule of thumb is not to store large, frequently modified files in Git, for several reasons that are beyond the scope of this answer (heuristics, snapshots, etc.). So I would suggest not adding such files to Git directly and instead storing them in a standalone repository that is included as a submodule.
This way you can still use Git to track changes, but your main Git repository will not grow to a huge size (pack files), and you can still manage it inside your project.
If you only want to store the SQL scripts, Git is a good choice since it will handle them like any other file.
I suspect this might be really obvious but I can't find a straightforward solution in the documentation or forums:
I have written some code that is held in a Mercurial repository on BitBucket.
I use this code to build Linux virtual servers. When I build a server, I clone the repo onto the server, run my build script, and then delete the clone. The result is a configured server with several files from my repo located in various folders on the server.
Now, I'm looking for a mechanism where I can roll out bug fixes and improvements to my users' servers after I have handed them over. At that time, I won't have SSH access to the servers and I cannot expect my end users to do anything more complicated than kick off a cron job or launch a script.
To achieve this, I'm thinking of setting up a BitBucket account for my users with read-only access to my repo.
I have no problem writing a script to clone my repo, via this read-only account, and apply the updates, but I don't want to include all my files. In particular, I want to exclude my build script as it is commercially sensitive. I know I could remove it from my repo, but then my build wouldn't work.
Reading around, it seems I may need to create a branch or a fork of my repo (which?). Or maybe a sub-repo? Then, I could remove the sensitive files from that branch/fork/sub-repo and allow my users to clone it via a script.
That's OK, but I need a way to update the branched/forked/sub repo as I make changes to the main one. Can this be automatic? In other words, can it be set up to always reflect the updates made in the main repo? Excluding the sensitive files of course.
I'm not sure I'd want updates to be automatic though, so I'd also like to know how to transfer updates from the main to the branch/fork/sub manually. A merge? If I do a merge, how do I make sure my sensitive files don't get copied across?
To sum up, I have a main repo which contains some sensitive files and I need a way to roll out updates of all but those sensitive files to my read-only users.
Sorry if this is hugely obvious. I'm sure it's a case of not seeing the wood for the trees and being overwhelmed by the possibilities. Thanks.
I don't think that you need to solve this in Mercurial at all.
What you actually need is Continuous Integration / a build server.
The simplest solution goes like this:
Set up a build server with something like TeamCity or Jenkins, that's always online and monitors changes in your Bitbucket repository.
You can set it up so that when there's a change in your repository, the build server runs your build script and copies the output to some FTP server, or download site, or whatever.
Now you have a single location that always contains the most recent code changes, but without the sensitive files like the build script.
Then, you can set up a script or cron job that your end users can run to get the newest version of the code from that central location.
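The end-user side of this could be a short script run from cron. The following is a sketch only: it assumes the build server publishes a `.tar.gz` archive at some URL (the URL and install directory are placeholders the caller supplies), and that the archive comes from your own trusted build.

```python
import os
import tarfile
import tempfile
import urllib.request


def fetch_and_unpack(archive_url, install_dir):
    """Download a .tar.gz published by the build server and unpack it
    into install_dir."""
    os.makedirs(install_dir, exist_ok=True)
    root = os.path.realpath(install_dir)
    with tempfile.TemporaryDirectory() as workdir:
        archive_path = os.path.join(workdir, "release.tar.gz")
        with urllib.request.urlopen(archive_url) as response, \
                open(archive_path, "wb") as out:
            out.write(response.read())
        with tarfile.open(archive_path, "r:gz") as archive:
            # Refuse member names that would land outside install_dir.
            for member in archive.getmembers():
                dest = os.path.realpath(os.path.join(root, member.name))
                if os.path.commonpath([root, dest]) != root:
                    raise ValueError("unsafe path in archive: " + member.name)
            archive.extractall(root)
```

Because the archive is produced by the build, it simply never contains the build script, so nothing has to be filtered out on the user's machine.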
You are fine with two branches: one that your users clone (main) and another for your main development (dev). The tricky part is merging the new changes from dev into main.
You can solve this by excluding files in the merge process (see "Excluding a file while merging in Mercurial").
By setting the [merge-patterns] section in your .hgrc you can specify which files are not affected by the merge.
[merge-patterns]
build.sh = internal:local
For more info read hg help merge-tools.
"internal:local"
Uses the local version of files as the merged version.
Entire Mercurial trees always get moved around together, so you can't clone or pull just part of a repository (along the file tree axis). You could keep a branch that has only part of the files, and then keep another branch that has everything, making it easy to merge the partial (in terms of files) branch into the other branch (but merging the other way wouldn't be particularly easy).
I'm thinking maybe subrepositories work for your particular use case.
I am (still) trying to completely migrate our company's SVN to HG.
For the most part I've succeeded, but we ran across a problem.
Our codebase has over 30 different projects, each one in its own folder.
I've been asked multiple times how to commit and then push specific files to our central repository instead of being forced to commit everything everywhere and then push it, which is certainly annoying. Not being able to pull only specific projects is also a nuisance.
Is there any way to handle this like we used to in SVN? Where we could just commit what we wanted and not everything, and update only what was necessary.
Thank you.
A major difference between SVN and Mercurial is that you should have one repository per project in Mercurial.
You can change your repository to be multiple repositories using the convert extension.
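To split one folder out into its own repository, the convert extension's filemap can `include` that folder and `rename` it to `.` so it becomes the new repository's root. The Python helper below is a hypothetical sketch that just generates one filemap per project; the project names are placeholders and are assumed to contain no spaces.

```python
def project_filemaps(project_dirs):
    """Map each project directory to filemap text that keeps only that
    directory and moves its contents to the repository root."""
    return {
        project: "include {0}\nrename {0} .\n".format(project)
        for project in project_dirs
    }


# For each project (placeholder names), with the convert extension
# enabled:
#   1. write the project's filemap text to ProjectA.map
#   2. hg convert --filemap ProjectA.map big-repo ProjectA-repo
```

Each converted repository keeps only the history that touched its own folder.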
Like Steve Kaye said, you should create one repo per project, but you may also want to create one master repo and include all your projects as subrepos. This will allow SVN-like behavior of getting a copy of everything.
I want to sell a copy of my system and need to transfer the source code to my customers. I use Mercurial as the VCS. There is some confidential data in my code, for example Amazon access key/secret key, database passwords and SSL private keys. Those keys are written in the code or configuration files, like this:
# settings of Amazon S3 storage
s3.storages:
access_key: <secret>
secret_key: <secret>
Before I transfer my code to them, I need to clean all of that confidential data out of the code base. But all of it is in the history (changesets). With Mercurial, how can I clean out those secrets?
If you're giving the customers only a snapshot you can do it after you run hg archive.
If you want to give them access to the repository with full history you need to use hg convert to exclude that file.
In that case you're probably better off just invalidating the AWS key and using a new one in the future -- Amazon makes that very easy.
Going forward you're better off not putting those keys into source control. Instead, put in a config.sample file and then add config.actual to your .hgignore.
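As a sketch of how a config.sample could be derived from the real file (or how a snapshot produced by hg archive could be scrubbed), the Python function below blanks the values of known secret keys. The key names and the `make_sample` helper are assumptions based on the example in the question, and it only handles simple `key: value` lines.

```python
import re

# Hypothetical list of keys whose values must never be shipped.
SECRET_KEYS = ("access_key", "secret_key", "password")


def make_sample(config_text, secret_keys=SECRET_KEYS):
    """Return config_text with the values of secret keys replaced by a
    placeholder, leaving every other line untouched."""
    pattern = re.compile(
        r"^([ \t]*(?:{})[ \t]*:[ \t]*).*$".format(
            "|".join(re.escape(key) for key in secret_keys)
        ),
        re.MULTILINE,
    )
    return pattern.sub(r"\1<secret>", config_text)
```

This cleans a single copy of a file only; it does nothing about secrets already recorded in earlier changesets, which is why rotating the keys is still the safer move.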