What is maximum file size limit in mercurial

What is maximum file size limit in mercurial - mercurial

Currently, I cant clone the mercurial directory due to following error
Abort: stream ended unexpectedly.
We have few files that are larger than 10MB in size. These files are already uploaded on mercurial repository but we are getting error while doing a clone of that directory. We also have checked our internet connection which is not the issue. please guide what is the maximum size mercurial can transfer.
Kind Regards

We've had this issue when hosting the "master" repository on a shared host.
The hosting company had routines in place which would kill any processes using too much memory, and it seems hgweb loads most of the repository in memory during cloning. Thus if the timing was right, hgweb would get killed in the middle of the cloning operation, producing the error message you posted on the client.
We've moved our "master" repository to Bitbucket for now.

If it's an abruptly aborted stream it's not any limitation that Mercurial is imposing -- that would come with a clear error message. What server do you have hosting hgweb? Are you using Apache or another http server? Are you going over ssh? This is more likely trouble at your network level than it is a Mercurial configuration issue -- and certainly it's not a fundamental Mercurial limitation.

Generally the limits are in Gb area and defined by the operating system rather than mercurial (https://www.mercurial-scm.org/wiki/HandlingLargeFiles).
However, you repository might have a hook configured to limit the binary file size. See for example https://www.mercurial-scm.org/pipermail/mercurial/2009-January/023322.html
So you need to check the configuration of your repository in .hg/hgrc

Related

How Does Mercurial Communicate Changes Between Clients?

Forgive me if this question sounds ridiculously naive, but I don't seem to be able to find a straightforward answer to this anywhere.
I am considering using Mercurial for source control on a small (2-3 developers) project. I like the idea of not having to subscribe to a central repository, and I like that everyone effectively has a complete copy of the project. What I don't understand is how the Mercurial clients communicate changes to each other. Does it require opening a specific port or something similar?
Any pointers to help on Mercurial or comments from people who have used it would be gratefully received.

Mercurial can access remote repositories in a number of ways. The main ones are:
File access (i.e. look at the repo at this path)
HTTP / HTTPS (i.e. look at the repo being served at this web address)
SSH (i.e. log into this machine / user, and look at the repo at this path)
The first only really works if your team is all working on a shared file-system, but is also useful if you personally have multiple clones that you want to move information between.
HTTP is how you would access something like Bitbucket, or some other "available to all" server. You can set-up a temporary server on a machine with the command hg serve, and that's useful if you need to share with a colleague quickly and don't care about security, but normally HTTP Mercurial servers sit behind Apache, or some other web server software.
An SSH based set-up is just a machine running an SSH server which has Mercurial installed. Mercurial logs on to the machine, and invokes hg remotely. The local hg and the remote hg then talk over the link. Very easy to set-up in my opinion. Credentials are all handled in normal SSH ways.
Further reading:Publishing Repositories

Is there a way to log everything that happens in Mercurial?

I'd need to keep a record of everything that happens with some repositories. By default I can keep a record of merging and commits, but I'd need a record of cloning, pulls, authentication, etc.
Is there a way to keep this logged somewhere?

Cloning, pulling, and authenticating are not part of a repository's history or state - all of that data is intentionally not tracked or pushed around by Mercurial.
In fact, Mercurial doesn't handle this behavior itself, it offloads this (particularly authentication) to the web server handling such requests. What you can do is look at the logs the server records. How this is tracked is very much server specific, but essentially you'd look at the access logs and see what is being requested.
Mercurial provides a lightweight web server hg serve for handling limited numbers of requests, and you can configure where both access and error logs are written to by this server with the -A and -E flags respectively. See hg help serve for more.
You could play with Mercurial's hooks and maybe log clones and pulls via that method, but authentication is completely transparent to Mercurial, so logging that must come from the server.

How are people accessing the repository. If it's over ssh you could easily log the command line that comes in via ssh. If it's over HTTP you can use your weblogs as a pretty fair record.
Mozilla has an open source 'pushlog' (example: https://hg.mozilla.org/mozilla-central/pushloghtml ) that they use to record all writes. You could probably easily tweak that code to record all reads too.

Is system-wide Mercurial installation enough in a shared enviroment?

I am learning how I can install Mercurial on our team system, but I am not experienced enough to make some decision.
For our team, we have a server machine used as a repository. Every team member also has her/his own Linux RedHat installed machine. However, we do not do anything on our local terminals and we do everything on the server. Every member has a user directory on the server such as /home/Cassie, /home/john, ... and we save all our code and work there. When we turn on the local terminals, the GNOME system shows our personal files on the server not the local machine. Whenever everyone click the terminal application on desktop, it connects to her own home directory. Thus, we do not need to use SSH command to connect to the server. It is like the school multi-users system. Everyone has a user account and she logs into her own account to do her own work. I hope I can install a shared repository on that server and every one can do push, pull, etc. all kind of commands there.
1) Since we use a shared environment, does it mean that I need to install Mercurial on only the server and that is enough for everyone to do "commit", "push", "pull", etc. commands?
2) By installing only system-wide Mercurial, does it eliminate the ability to do local commit? If I would like to let everyone still have the "local commit" ability, how should I do it?
3) I have searched online. Some people mentioned that for a shared network server, it is impossible to have locks for any two users if they are trying to access the same file at the same time. Does it imply my situation?
In sum, we do all the work on the server. I hope to find a plan to have Mercurial control on a repository shared by everyone when everyone still has local commit ability and the repository still has some locks protection if any two users try to access a file at the same time. If this scenario is feasible, can I just install the Mercurial on the server or I need to install Mercurial for both servers and users machines? If it is impossible for the scenario, would someone please suggest me a plan to have version control for our system?

1) Since we use a shared environment, does it mean that I just need to install the Mercurial on the server and it is enough for everyone to do "commit","push","pull"..etc commands ?
If your users are logging into a shell on the server in order to do their work, then yes it is sufficient to have Mercurial installed only on the server.
2) By installing only system-wide Mercurial, does it eliminate the ability to do local commit ? If I would like to let everyone still have the "local commit" ability, how should I do it ?
Your users will presumably checkout from a shared "root" repository into their own home directory in order to work on the code. They will have a "local" copy of the repo in their home directory and will push into the shared root repository.
3) I have searched online. Some people mentioned that for a shared network server, it is impossible to have locks for any two users if they are trying to access the same file at the same time. Does it imply my situation ?
As long as your users are working within their own local copies of the repo, they will not interfere with one another. The only time a conflict may arise is when committing back to the shared root repository -- in which case the user will need to merge their changes and resolve any conflicts.
I would recommend reading carefully through Joel Spolsky's excellent Hg Init tutorial for a better understanding of how Mercurial handles "central" and "local" copies.

Many binary files synchronization

I have about 100 000 files on office server (images, pdf's, etc...)
Each day files count grows about 100-500 items, and about 20-50 old files changes.
What is the best way to synchronize Web-server with these files?
Can any system like Mercurial, GIT help?
(On office server, I'll commit changes, and web-server periodically do updates)?
Second problem is, that on Web-server I have user-generated-content (binary-files) (other files).
Each day this users upload about 1000-2000 new files. Old files don't change.
And I need to backup these files to local machine.
Can any system like Merurial, GIT help in this situation?
(On web-server I'll commit these files by cron, and on local machine I'll do updates)
Thanks
UPD.
Office server is Windows Server 2008 R2
Web-server is Debian 5 lenny

The simplest and most reliable mechanism (in my experience) is rsync.
On Windows, however, rsync over ssh is badly broken due to issues with how Cygwin interacts with named pipes. Rsync over its own protocol works (as long as you don't care about encryption), but I've had lots of problems getting rsync to stay up as a Windows service for more than a few days at a time. DeltaCopy is a Windows app that uses the rsync tools behind the scenes; it seems to work very well, though I haven't tried the ssh option.

A DVCS is not a good solution in this case: it will keep the all history, which you don't always need, and will make any clone a massive operation.
An artifact repository like Nexus is much more adapted if you need some kind of versioning with integrity check associated with your binaries.
Otherwise (no versioning), a simple rsync like Marcelo proposes is enough.

Mercurial local repository backup

I'm a big fan of backing things up. I keep my important school essays and such in a folder of my Dropbox. I make sure that all of my photos are duplicated to an external drive. I have a home server where I keep important files mirrored across two drives inside the server (like a software RAID 1).
So for my code, I have always used Subversion to back it up. I keep the trunk folder with a stable copy of my application, but then I create a branch named with my username, and inside there is my working copy. I make very few changes between commits to that branch, with the understanding that the code in there is my backup.
Now I'm looking into Mercurial, and I must admit I haven't truly used it yet so I may have this all wrong. But it seems to me that you have a server-side repository, and then you clone it to a working directory in the form of a local repository. Then as you work on something, you make commits to that local repository, and when things are in a state to be shared with others, you hg push to the parent repository on the server.
Between pushes of stable, tested, bug-free code, where is the backup?
After doing some thinking, I've come to the conclusion that it is not meant for backup purposes and it assumes you've handled that on your own. I guess I need to keep my Mercurial local repositories in my dropbox or some other backed-up location, since my in-progress code is not pushed to the server.
Is this pretty much it, or have I missed something? If you use Mercurial, how do you backup your local repositories? If you had turned on your computer this morning and your hard drive went up in flames (or, more likely, the read head went bad, or the OS corrupted itself, ...), what would be lost? If you spent the past week developing a module, writing test cases for it, documenting and commenting it, and then a virus wipes your local repository away, isn't that the only copy?
So then on the flip side, do you create a remote repository for every local repository and push to it all the time?
How do you find a balance? How do you ensure your code is backed up? Where is the line between using Mercurial as backup, and using a local filesystem backup utility to keep your local repositories safe?

It's ok thinking of Subversion as a 'backup', but it's only really doing that by virtue of being on a separate machine, which isn't really intrinsic to Subversion. If your Subversion server was the same machine as your development machine - not uncommon in the Linux world - you're not really backed up in the sense of having protection from hardware failure, theft, fire, etc. And in fact, there is some data in that case that is not backed up at all - your current code may exist in two places but everything else in the repository (eg. the revision history) only exists in one place, on the remote server.
It's exactly the same for Mercurial except that you've taken away the need for a separate server and thus made it so that you have to explicitly think about backing up rather than it being a side-effect of needing to have a server somewhere. You can definitely set up another Mercurial repository somewhere and push your changes to that periodically and consider that your backup. Alternatively, simply backup your local repository in the same way that you'd back up any other important directory. With you having a full copy of the repository locally, including all revision history and other meta data, this is arguably even more convenient and safe than the way you currently do it with Subversion.

The "hidden" .hg directory stores all of the local commits. You can back up this directory using a standard backup program.

The changes get to the remote directory only when you push. Commits stay local, but you get them if you clone your repository. Then, yes, if you want your things to get to the server repository you have to push to it "all the time".
On the other hand, nothing stops you to have several machines and push content from one to another. Every mercurial repository can turn itself into a server in a matter of seconds typing "hg serve".
I'm not sure it really answer to you question, but I too am a big fan of backup and manage things this way with many clones of my repository (I also use massively mq to work in patch mode but that's another story).
PS: as a sidenote, I'm considering to use mercurial as a tool for filesystem backup. The only thing that bother me is that for this purpose I would prefer to disable the diff feature and treat all files as binary, but that should be easy.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008