I'm working on a Mercurial GUI client that interacts with hg.exe through the command line (the preferred high-level API, as I understand it).
However, I am having trouble determining the possible outputs of each command. I can see several outputs by simulating situations, but I was wondering if there is a complete reference of the possible outputs for each command.
For instance, for the command hg fetch, some possible outputs are:
pulling from https://User@server.com/Repo
searching for changes
no changes found
if there are no changes, or:
abort: outstanding uncommitted changes
or one of several other messages, depending on the situation.
I would like to structure my program to handle as many of these cases as possible, but it's hard for me to know in advance what they all are.
Is there a documented reference for the command-line? I have not been able to find one with The Google.
Look through the translation strings file. Then you'll know you have every message handled and be able to see what parts of it vary.
Also, fetch is just a convenience wrapper around pull/update/merge. If you're invoking Mercurial programmatically, you probably want to keep those three very different concepts separate when you run them, so you know which part failed. In your example above it's the update that fails: the pull would have succeeded, and knowing that it was the update that failed would let you show the user a better message.
(fetch is an abomination, which is part of why it's disabled by default)
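To illustrate that separation, here is a minimal C# sketch; RunHg and ReportError are helpers invented for this example (not any library API), and repo stands for your repository path:

using System.Diagnostics;

static int RunHg(string repoPath, string arguments, out string output)
{
    var psi = new ProcessStartInfo("hg", arguments)
    {
        WorkingDirectory = repoPath,
        RedirectStandardOutput = true,
        RedirectStandardError = true,
        UseShellExecute = false
    };
    using (var p = Process.Start(psi))
    {
        var stderr = p.StandardError.ReadToEndAsync(); // read async to avoid pipe deadlock
        string stdout = p.StandardOutput.ReadToEnd();
        p.WaitForExit();
        output = stdout + stderr.Result;
        return p.ExitCode;
    }
}

// Run pull, then update, as separate steps so the user can be told
// exactly which one failed.
string log;
if (RunHg(repo, "pull", out log) != 0)
    ReportError("pull failed", log);        // ReportError: your UI's error path
else if (RunHg(repo, "update", out log) != 0)
    ReportError("update failed", log);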
Is this what you were looking for: https://www.mercurial-scm.org/wiki/MercurialBook ?
Mercurial 1.9 brings a command server: a stable (in the sense that the API doesn't change much) and low-overhead interface (there is no need to spawn an hg process for every command). The communication is done via a pipe.
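A hedged sketch of starting the command server from C# and reading its hello message, assuming the framing described on the Mercurial CommandServer wiki page (1-byte channel, 4-byte big-endian length, then the payload); repoPath is illustrative:

using System;
using System.Diagnostics;
using System.IO;

var psi = new ProcessStartInfo("hg", "serve --cmdserver pipe -R " + repoPath)
{
    RedirectStandardInput = true,
    RedirectStandardOutput = true,
    UseShellExecute = false
};
using (var hg = Process.Start(psi))
{
    var r = new BinaryReader(hg.StandardOutput.BaseStream);
    // Each message is: 1-byte channel, 4-byte big-endian length, payload.
    char channel = (char)r.ReadByte();
    byte[] lenBytes = r.ReadBytes(4);
    Array.Reverse(lenBytes);                     // big-endian -> host order
    uint length = BitConverter.ToUInt32(lenBytes, 0);
    string hello = System.Text.Encoding.UTF8.GetString(r.ReadBytes((int)length));
    Console.WriteLine("channel {0}: {1}", channel, hello); // 'o' carries the hello
}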
I have a folder with ~10 000 subfolders.
Can any Linux API or tool watch for any change in any folder below e.g. /SharedRoot, or do I have to set up inotify for each folder? (I.e., I lose if I want to do this for 10k+ folders.) I guess yes, since I've already seen examples of this inefficient method, for instance http://twistedmatrix.com/trac/browser/trunk/twisted/internet/inotify.py?rev=28866#L345
My problem:
I need to keep folders time-sorted with most recently active "project" up top.
When a file changes, each folder above that file should update its last-modified timestamp to match the file. Delays are OK. Opening a file (typically MS Excel) and closing it again can make its file date jump up and then back down. For this reason I need to wait until after a file is closed, then queue that file's folder for checking, and only a while later go and look for the newest file in the folder, since the file date of the triggering file could already be back-dated to its original timestamp by Excel or similar programs. Also, in case several files from the same folder are used or created, it makes sense to buffer the timestamping of that folder's parents, so that a bunch of updates collapse into one delayed update.
I'm looking for a Linux solution. I have some code that can be run on a Windows server; most of the queuing functionality is here: http://github.com/sesam/FolderdateFollowsFiles/blob/master/FolderdateFollowsFiles/Follower.vb
Available APIs
The relative of inotify on Windows, ReadDirectoryChangesW, can watch a folder and its whole subtree; see bWatchSubtree on http://msdn.microsoft.com/en-us/library/aa365465(VS.85).aspx
Samba?
Patching the Samba source is a possibility, but perhaps there are already hooks available? Are there other possibilities, like watching file activity on the client side (various Windows versions) in order to update folders recursively?
Yes, you need to use inotify; however, you need not consume watches on every node immediately.
The process (similar to how beagle does it) is rather simple:
Establish a watch on the root node.
Do a breadth-first (not depth-first) search starting at the root node.
Establish watches on directories, in the order of the search.
Watch for directory-create events, and continue adding watches as new directories appear. Re-sort your list as this happens.
The breadth-first search is important; otherwise you might miss events due to a race between when you start and what clients of the root node are doing.
See this question, which also mentions this RFQ. I had the same exact problem that you are facing.
In essence, one thread continues to watch for directory create events, adding new watches on new directories almost at the same time that they are created. Something else sorts the list either on demand, or after the inotify thread releases its lock.
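A sketch of that breadth-first watch setup in C#, using P/Invoke into libc; the mask constants come from sys/inotify.h, and error handling is trimmed to the essentials:

using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;

class WatchTree
{
    [DllImport("libc", SetLastError = true)]
    static extern int inotify_init();

    [DllImport("libc", SetLastError = true)]
    static extern int inotify_add_watch(int fd, string pathname, uint mask);

    const uint IN_CREATE = 0x00000100;      // file/dir created in a watched dir
    const uint IN_CLOSE_WRITE = 0x00000008; // a writable file was closed

    static void Main()
    {
        int fd = inotify_init();
        if (fd < 0) throw new IOException("inotify_init failed");

        // Breadth-first: watch each directory before descending into it,
        // so creations inside already-watched parents cannot be missed.
        var queue = new Queue<string>();
        queue.Enqueue("/SharedRoot");
        while (queue.Count > 0)
        {
            string dir = queue.Dequeue();
            if (inotify_add_watch(fd, dir, IN_CREATE | IN_CLOSE_WRITE) < 0)
                continue; // vanished or unreadable directory; skip it
            foreach (string sub in Directory.EnumerateDirectories(dir))
                queue.Enqueue(sub);
        }
        // ...read struct inotify_event records from fd on a worker thread,
        // adding watches for IN_CREATE of directories as they arrive...
    }
}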
I've attempted lock-free versions of the above, but with .. questionable .. success :)
I saw you are running these trees under a Samba share. Maybe you can use the ClamAV virus scanning VFS module for inspiration to see how they trigger the 'scan on close'.
Samba Howto : Stackable VFS Modules
It should be pretty straightforward to check the time of the closed file and update the timestamps along the directory path leading to it, without any of the performance/memory overhead associated with inotify et al.
Just a thought.
Since MathWorks release a new version of MATLAB every six months, it's a bit of a hassle having to set up the latest version each time. What I'd like is an automatic way of configuring MATLAB, to save wasting time on administrative hassle. The sorts of things I usually do when I get a new version are:
Add commonly used directories to the path.
Create some toolbar shortcuts.
Change some GUI preferences.
The first task is easy to accomplish programmatically with addpath and savepath. The next two are not so simple.
Details of shortcuts are stored in the file 'shortcuts.xml' in the folder given by prefdir. My best idea so far is to use one of the XML toolboxes in the MATLAB Central File Exchange to read in this file, add some shortcut details and write them back to file. This seems like quite a lot of effort, and that usually means I've missed an existing utility function. Is there an easier way of (programmatically) adding shortcuts?
Changing the GUI preferences seems even trickier. preferences just opens the GUI preference editor (equivalent to File -> Preferences); setpref doesn't seem to cover GUI options.
The GUI preferences are stored in matlab.prf (again in prefdir); this time in traditional name=value config style. I could try overwriting values in this directly, but it isn't always clear what each line does, or how much the names differ between releases, or how broken MATLAB will be if this file contains dodgy values. I realise that this is a long shot, but are the contents of matlab.prf documented anywhere? Or is there a better way of configuring the GUI?
For extra credit, how do you set up your copy of MATLAB? Are there any other tweaks I've missed, that it is possible to alter via a script?
shortcuts - read here and here
preferences - read http://undocumentedmatlab.com/blog/changing-system-preferences-programmatically/
At the moment, I'm not using scripts, though this sounds like a very interesting idea.
Unless there are new features that you also want to configure, you can simply copy-paste the old preferences into the new prefdir. I guess this should be doable programmatically, though you might have to select the old prefdir via uigetdir. So far, this has not created major problems for me. Note also that in case of a major change in the structure of preferences, any programmatic version would have to be rewritten as well.
I'm adding paths at each startup, so that I don't need to think of manually adding new directories every time I change something in my code base (and I don't want to have to update directory structures for each user). Thus, I also need to copy-paste startup.m for each installation.
If I had to do everything manually, I'd also want to change the autosave options to store the files in an autosave directory. If I recall correctly, Matlab reads the colors and fonts from previous installations, so I don't have to do that.
File systems are volatile. This means that you can't trust the result of one operation to still be valid for the next one, even if it's the next line of code. You can't just say if (some file exists and I have permissions for it) open the file, and you can't say if (some file does not exist) create the file. There is always the possibility that the result of your if condition will change in between the two parts of your code. The operations are distinct: not atomic.
To make matters worse, the nature of the problem means that if you're tempted to make this check, odds are you're already worried or aware that something you don't control is likely to happen to the file. The nature of development environments makes this event less likely to happen during your testing and very difficult to reproduce. So not only do you have a bug, but the bug won't show up while testing.
Therefore under normal circumstances the best course of action is to not even try to check if a file or directory exists. Instead, put your development time into handling exceptions from the file system. You have to handle these exceptions anyway, so this is a much better use of your resources. Even though exceptions are slow, checking the existence of a file requires an extra trip to disk, and disk access is much slower. I even have a well-voted answer to this effect in another question.
But I'm having some doubts. In .Net, for example, if that were really always true, the .Exists() methods wouldn't be in the API in the first place. Also consider scenarios where you expect your program to need to create the file. The first example that comes to mind is a desktop application. This application installs a default user-config file to its home directory, and the first time each user starts the application it copies this file to that user's application data folder. It expects the file not to exist on that first startup.
So when is it acceptable to check in advance for the existence (or other attributes, like size and permissions) of a file? Is expecting failure rather than success on the first attempt a good enough rule of thumb?
The File.Exists method exists primarily for testing for the existence of a file when you do not intend to open the file. For example, testing for the existence of a lock file whose very existence tells you something but whose contents are immaterial.
If you are going to open the file then you will need to handle any exception regardless of the results of any prior calls to File.Exists. So, in general, there is no real value in calling it in these circumstances. Just use the appropriate FileMode enumeration value in your open method and handle any exceptions, as simple as that.
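For example (a sketch; which FileMode value is right depends on what the caller means by "open"):

try
{
    // OpenOrCreate: open if present, create if not -- one atomic decision
    // made by the file system instead of a racy Exists check.
    using (var fs = new FileStream(path, FileMode.OpenOrCreate,
                                   FileAccess.ReadWrite, FileShare.None))
    {
        // ...work with fs...
    }
}
catch (UnauthorizedAccessException) { /* no permission */ }
catch (IOException)                 { /* sharing violation, disk error, ... */ }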
EDIT: Even though this is couched in terms of the .Net API, it is based on the underlying system API. Both Windows and Unix have system calls (e.g. CreateFile on Windows) that take the equivalent of the FileMode enumeration. In fact in .Net (or Mono) the FileMode value is just passed through to the underlying system call.
As a general policy, methods like File.Exists, or properties like WeakReference.Alive or SomeConcurrentQueue.Count, are not useful as a means of ensuring that a "good" state exists, but can be useful as a means of determining that a "bad" state exists without doing any unnecessary (and possibly counterproductive) work. Such situations may arise in many scenarios involving locks (and files, since they often involve locks). Because all routines that need to lock a set of resources should, whenever practical, acquire locks on those resources in a consistent order, it may be necessary to acquire a lock on one resource which is expected to exist before acquiring a resource which may or may not exist. In such a scenario it's impossible to avoid the possibility of locking the first resource, failing to acquire the second, and then releasing the first lock without having done any useful work with it. But checking for the existence of the second resource before locking the first minimizes that unnecessary and useless effort.
It depends on your requirements, but one way is to try to obtain an exclusive open file handle, with some sort of retry mechanism. Once you have that handle, it's going to be hard (or impossible) for another process to delete (or move) that file.
I've used code in .NET similar to the following to obtain an exclusive file handle, where I expect some other process to be possibly writing the file:
FileInfo fi = new FileInfo(fullFilePath);
FileStream fs = null;
int attempts = maxAttempts;
do
{
    try
    {
        // Asking to open for reading with exclusive access...
        fs = fi.Open(FileMode.Open, FileAccess.Read, FileShare.None);
    }
    // Ignore any errors (file missing, still locked by the writer, etc.)...
    catch {}
    if (fs != null)
    {
        break;
    }
    else
    {
        // Another process still has the file; back off briefly and retry.
        Thread.Sleep(100);
    }
}
while (--attempts > 0);
One example: You may be able to check for existence of files which you are unable to open (due to, for example, permissions).
Another, possibly better, example: you want to check for the existence of a Unix device file. But definitely do not open it; opening it has side effects (e.g., opening and closing /dev/st0 will rewind the tape).
In a *nix environment, a well-established method for checking whether another copy of a program is already running is to create a lock file; the check for file existence is used to verify this.
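In .NET the same single-instance check can be expressed with FileMode.CreateNew, which makes the existence test and the creation one atomic operation (the path here is illustrative):

try
{
    // CreateNew fails if the file already exists, so the check and the
    // creation cannot be split apart by another process.
    using (var lockFile = new FileStream("/var/run/myapp.lock",
                                         FileMode.CreateNew))
    {
        // We own the lock; run the program body here.
    }
}
catch (IOException)
{
    Console.Error.WriteLine("Another instance is already running.");
}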
I'd only check it if I expect it to be missing (e.g. the application settings) and only if I have to read the file.
If I have to write to the file, it's either a logfile (so I can just append to it or create a new one) or I replace the contents of it, so I might as well recreate it anyway.
If I expect the file to exist, it seems right that an exception is thrown when it is missing. Exception handling should then inform the user or perform recovery. My opinion is that this results in cleaner code.
File protection (i.e. not overwriting (possibly important) files) is different, in that case I'd always check whether a file exists, if the framework doesn't do that for me (think SaveFileDialog)
I think the check makes sense when you want to be sure the file was there in the first place. As you said settings files...if there is a file I will try and merge the existing settings instead of blowing them away.
Other cases would be when a user tells me to do something with a file. Yes, I know the OpenFileDialog will check whether a file exists (but this is optional). I vaguely remember that back in VB6 this was not the case, so verifying that the file the user just told me to use actually existed was common.
I'd rather not program by exception.
Edit
I didn't miss the point. You might try to access the file, an exception is thrown, and then when you go to create the file, it has already been placed there. Which now causes your exception handling code to go on the fritz. So I guess we could then have an exception handler in our exception handler to catch that the file changed yet again...
I'd rather try and prevent exceptions, not use them to control logic.
Edit
Additionally, another time to check for attributes such as size is when you're waiting for a file operation to finish. Yes, you never know for sure, but with a good algorithm, and depending on the system writing the file, you might be able to handle a good deal of cases. (I had a system running for five years which watched for small files coming over FTP; it used the same API as the file system watcher, then started polling, waiting for the file to stop changing, before raising an event that the file was ready to be consumed.)
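A sketch of that "poll until the file stops changing" idea; the two-second stability window is an assumption for illustration, not a number from the original system:

using System;
using System.IO;
using System.Threading;

static void WaitUntilStable(string path)
{
    var fi = new FileInfo(path);
    DateTime lastWrite;
    long lastSize;
    do
    {
        fi.Refresh();
        lastWrite = fi.LastWriteTimeUtc;
        lastSize = fi.Length;
        Thread.Sleep(2000);   // stability window (assumption)
        fi.Refresh();
    }
    // Keep polling until neither the timestamp nor the size has moved.
    while (fi.LastWriteTimeUtc != lastWrite || fi.Length != lastSize);
}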
This may be too simplistic, but I would think the primary reason for checking for the existence of a file (hence the existence of .Exists()) would be to prevent unintended overwrites of existing files, not to avoid exceptions caused by attempting to access non-existent or inaccessible files.
EDIT 2
This was, in fact, too simplistic and I recommend you see Stephen Martin's response.
If you're that concerned about somebody else removing the file, perhaps you should implement some sort of locking system. For instance, I used to work on the code for C-News, a Usenet news server. Since a lot of the things it did could happen asynchronously, it would "lock" a file or a directory by making a temp file, and then hard linking it to a file named "LOCK". If the link failed, it would mean that some other version of the program was writing to that directory, otherwise it was yours and you could do what you like.
The nifty thing about this is that most of the program was written in shell and awk, and this was a very portable locking mechanism. Also, the lock file would contain the PID of the owner, so you could look at the existing lock file to see if the owner was still running.
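The same idea in a C# sketch, P/Invoking libc's link(2), which fails atomically if the target name already exists; the file names are illustrative:

using System.Diagnostics;
using System.IO;
using System.Runtime.InteropServices;

class LinkLock
{
    // link(2) fails with EEXIST if newpath already exists -- atomically.
    [DllImport("libc", SetLastError = true)]
    static extern int link(string oldpath, string newpath);

    static bool TryLock(string dir)
    {
        int pid = Process.GetCurrentProcess().Id;
        string tmp = Path.Combine(dir, "lock." + pid);
        File.WriteAllText(tmp, pid.ToString());   // lock file records owner PID
        return link(tmp, Path.Combine(dir, "LOCK")) == 0;
    }
}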
We have a diagnostic tool that has to gather a set of files, installer log included. Depending on different conditions the installer log can be in one of two folders. Even worse, there can be different versions of the log in both of these folders. How does the tool find the right one?
It's quite simple if you check for existence. If only one is present, grab that file. If two exist, find which has the latest modification time and grab that file. That's just normal way of doing things.
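In C#, for instance, that logic stays short; folderA and folderB are placeholders for the two candidate locations:

using System.IO;
using System.Linq;

// Prefer whichever installer log exists; if both do, take the newest.
string log = new[] { Path.Combine(folderA, "install.log"),
                     Path.Combine(folderB, "install.log") }
    .Where(File.Exists)
    .OrderByDescending(File.GetLastWriteTimeUtc)
    .FirstOrDefault();   // null if neither exists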
While this is a language-agnostic post, it seems you are talking about .NET. Most systems (.NET and others) have more detailed APIs in order to figure out if the file exists when opening the file.
What you should do is make a call to access the file, as it will typically indicate through some sort of error that the file doesn't exist (if it truly doesn't). In .NET, you would have to go through the P/Invoke layer and use the CreateFile API function. If that function returns an error of ERROR_FILE_NOT_FOUND, then you know that the file does not exist. If it returns successfully, then you have a handle that you can use.
The point here is that it is a somewhat atomic operation, which ultimately is what you are looking for.
Then, with the handle, you can pass it to a FileStream constructor and perform your work on the file.
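A hedged sketch of that approach (constants from the Win32 headers; read-only open; error handling trimmed):

using System;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

class NativeOpen
{
    const uint GENERIC_READ = 0x80000000;
    const uint OPEN_EXISTING = 3;
    const int ERROR_FILE_NOT_FOUND = 2;

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern SafeFileHandle CreateFile(
        string lpFileName, uint dwDesiredAccess, uint dwShareMode,
        IntPtr lpSecurityAttributes, uint dwCreationDisposition,
        uint dwFlagsAndAttributes, IntPtr hTemplateFile);

    static FileStream TryOpen(string path)
    {
        SafeFileHandle handle = CreateFile(path, GENERIC_READ, 0, IntPtr.Zero,
                                           OPEN_EXISTING, 0, IntPtr.Zero);
        if (handle.IsInvalid)
        {
            // One call distinguishes "not found" from "found but not openable".
            int err = Marshal.GetLastWin32Error();
            if (err == ERROR_FILE_NOT_FOUND)
                return null;
            throw new IOException("CreateFile failed", err);
        }
        return new FileStream(handle, FileAccess.Read);
    }
}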
There are a number of possible applications you may well be writing for which a simple File.Exists is more than adequate for the job. If it's a config file that only your application will use, then you do not need to go overboard with exception handling.
Whilst the "flaws" you have pointed out in using this method are all valid, it doesn't mean they are not acceptable flaws in some situations.
A variety of apps include built-in web servers. It's common for them to generate self-signed SSL certificates the first time they start up. A straightforward way to implement this would be to check whether the cert exists on startup, and create it if not.
In theory, it could exist for the check, and not exist later. In that case, we'd get an error when we try to listen, but that can be handled quite easily and is not a big deal.
It's also possible that it doesn't exist for the check, and exists later. In that case, it either gets overwritten with a new cert, or writing the new cert fails, depending on your policy. The first is a little annoying, in terms of the cert change causing some alarm, but also not really critical, especially if you do a bit of logging to indicate what is going on.
And, in practice, both cases are extraordinarily unlikely to ever come up.
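A sketch of that startup path; certPath, GenerateSelfSignedCert, StartListener and Log are all hypothetical names for illustration:

// First run (or someone deleted the cert): create it.
if (!File.Exists(certPath))
{
    GenerateSelfSignedCert(certPath);
}
try
{
    StartListener(certPath);   // the rare races surface here as errors...
}
catch (IOException ex)
{
    Log("listener failed: " + ex.Message);   // ...and are logged, not fatal
}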
As you pointed out, it's always important what the program should do if the file is missing. In all my applications the user can always delete the config file, and the application will create a new one with default values. No problem. I also ship my applications without config files.
But users tend to delete files, even files they should not delete, like serial keys and template files. I always check for these files, because without them the application is unable to run at all. I cannot create a new serial key from defaults.
What should happen when the file is missing? You can do a file check or use an exception handler, but the real question is: what will happen when the file is missing? Or, how important is the file to the application? I check all the time before I try to access any support files for the app. Additionally, I do error handling if the file is corrupt and cannot be loaded.
I think anytime that you know that the file may or may not exist and you want to perform some alternate action based on the existence of the file, you should do the check because in this case it's not an exceptional condition for the file to not exist. This won't absolve you from having to handle exceptions -- from someone else either removing or creating the file between the check and your open -- but it makes the intent of the program clear and doesn't rely on exception handling to perform flow-control logic.
EDIT: An example might be log rotation on start up.
try
{
    if (File.Exists("app.log"))
    {
        RotateLogs();
    }
    log = File.Open("app.log", FileMode.CreateNew);
}
catch (IOException)
{
    // ...another writer, perhaps?
}
catch (UnauthorizedAccessException)
{
    // ...maybe I should have used runas?
}
To answer my own question (in part), I want to expand on the example I used: a default config file.
Rather than check whether it exists at app startup and copy the file only if the check fails, the thing to do is always try to copy the file, but in such a way that the copy will fail if the file exists rather than replace it. This way, all you need to do is catch and ignore the exception thrown if the copy fails because of an existing file.
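In .NET, File.Copy without the overwrite flag does exactly this; the path names are illustrative:

try
{
    // File.Copy(source, dest) throws IOException if dest already exists,
    // so the existence test and the copy are a single operation.
    File.Copy(defaultConfigPath, userConfigPath);
}
catch (IOException)
{
    // The user already has a config file; keep it.
}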
Your problem could easily be solved with basic computer science... read up on Semaphores.
(I did not mean to sound like a jerk, I was just pointing you to a simple answer for a common problem).
I think the reason for "Exists" is to determine when files are missing without the need for creating all the OS housekeeping data required to access the file or having exceptions being thrown. So it's a file handling optimisation more than anything else.
For a single file, the saving that "Exists" gives is generally insignificant. If you were checking whether a file exists many, many times (for example, when searching for #include files), then the saving could be significant.
In .Net, the specification for File.Exists doesn't list any exceptions that the method might throw, unlike for example File.Open which lists nine exceptions, so there's certainly less checking going on in the former.
Even if "Exists" returns true, you still need to handle exceptions when opening the file, as the .Net reference suggests.