What is an efficient way for logging in an existing system - exception

I have the following in my system:
4 File folders
5 Applications that do some processing on files in the folders and then move files to the next folder (processing: read files, update db..)
The process is defined by Stages: 1,2,3,4,5.
As the files are moved along, the Stage field within them is updated to the next Stage.
Sometimes there are exceptions in the system, not necessarily exception in code but exception in the process.
For instance, there is an error in transmitting the file to the next folder. In this case the stage is not updated and an record is written in the DB for this file.
What I want to do, what is the best approach?
I want to plug a utility of some sort or add code to the applications that will capture any exceptions in the process. Like if a file was not moved, I want to know what stage and why. This will help in figuring out the break down in the process.
I need something that will provide the overall health of the process.
Now sure how to go about doing this from an architectural point of view.

The scheduler? Well that might knock the idea out anyway.
Exit code is still up and running from dos days.
it's a property of the Application Class (0 the default) is success
So from your app you'd detect an error and set ApplicationExitCode to some meaning number like 1703 (boo hoo)
Application.ShutDown(1703);// is the .net4 way
However seeing as presumably the scheduler is just running the app, you'd have to script it all up. Might as well just write a common logging dll and add it to each app as mess about with that, especially if you want the same behaviour if it's run from outside the scheduler.
Another option would be delegating. ie you write an app that runs the app (passed in as a command line parameter) and logs the result (via exit code for instance) and then change scheduler items to call that with the requisite parameter.

Related

Run Perl in Browser with PerlTray

I am using perl tray from activestate and have a question. I am wanting to make some type of ui or way for a user to set "Settings" on my application. These settings can just be written / read from a text file that is stored on the users computer.
The part I am not understanding though is how to go about making a ui. The only thing i can think of is showing a local perl page that runs on their computer to write to the file. However, I'm not sure how i could get perl to run in the browser when only using perltray.
Any suggestions?
PerlTray is an odd duck. It has an implicit event loop that kicks in after you either fall off the end of your program or after your 1st call to exit(). This makes it incompatible with most other common GUI event loops or most mini-server techniques that operate in the same process & thread.
2 possibilities come to mind:
Most Likely you'll have success spawning a thread or process that creates a traditional perl GUI or a mini-server hosting your configuration web-app. I'd probably pick Tkx, but that's just my preference.
I have a suspicion that the Event Loop used by Win32::GUI may actually be compatible with the event loop in PerlTray, but some experimentation would be required to verify that. I generally avoid Win32::GUI because it's not platform independent, but if you're using PerlTray, you're tied to Windows anyway...

What's debug section in IDA Pro?

I try to analyze a dll file with my poor assembly skills, so forgive me if I couldn't achieve something very trivial. My problem is that, while debugging the application, I find the code I'm looking for only in debug session, after I stop the debugger, the address is gone. The dll doesn't look to be obfuscated, as many of the code is readable. Take a look at the screenshot. The code I'm looking for is located at address 07D1EBBF in debug376 section. BTW, where did I get this debug376 section?
So my question is, How can I find this function while not debugging?
Thanks
UPDATE
Ok, as I said, as soon as I stop the debugger, the code is vanished. I can't even find it via sequence of bytes (but I can in debug mode). When I start the debugger, the code is not disassembled imediately, I should add a hardware breakpoint at that place and only when the breakpoint will be hit, IDA will show disassembled code. take a look at this screenshot
You see the line of code I'm interested in, which is not visible if the program is not running in debug mode. I'm not sure, but I think it's something like unpacking the code at runtime, which is not visible at design time.
Anyway, any help would be appreciated. I want to know why that code is hidden, until breakpoint hit (it's shown as "db 8Bh" etc) and how to find that address without debugging if possible. BTW, could this be a code from a different module (dll)?
Thanks
UPDATE 2
I found out that debug376 is a segment created at runtime. So simple question: how can I find out where this segment came from :)
So you see the code in the Debugger Window once your program is running and as you seem not to find the verry same opcodes in the raw Hex-Dump once it's not running any more?
What might help you is taking a Memory Snapshot. Pause the program's execution near the instructions you're interested in to make sure they are there, then choose "Take memory snapshot" from the "Debugger" Menu. IDA will then ask you wether to copy only the Data found at the segments that are defined as "loder segments" (those the PE loader creates from the predefined table) or "all segments" that seem to currently belong to the debugged program (including such that might have been created by an unpacking routine, decryptor, whatever). Go for "All segments" and you should be fine seeing memory contents including your debug segments (a segment
created or recognized while debugging) in IDA when not debugging the application.
You can view the list of segements at any time by pressing Shift+F7 or by clicking "Segments" from View > Open subviews.
Keep in mind that the programm your trying to analyze might choose to create the segment some other place the next time it is loaded to make it harder to understand for you what's going on.
UPDATE to match your second Question
When a program is unpacking data from somewhere, it will have to copy stuff somewhere. Windows is a virtual machine that nowadays get's real nasty at you when trying to execute or write code at locations that you're not allowed to. So any program, as long as we're under windows will somehow
Register a Bunch of new memory or overwrite memory it already owns. This is usually done by calling something like malloc or so [Your code looks as if it could have been a verry pointer-intensive language... VB perhaps or something object oriented] it mostly boils down to a call to VirtualAlloc or VirtualAllocEx from Windows's kernel32.dll, see http://msdn.microsoft.com/en-us/library/windows/desktop/aa366887(v=vs.85).aspx for more detail on it's calling convention.
Perhaps set up Windows Exception handling on that and mark the memory range als executable if it wasn't already when calling VirtualAlloc. This would be done by calling VirtualProtect, again from kernel32.dll. See http://msdn.microsoft.com/en-us/library/windows/desktop/aa366898(v=vs.85).aspx and http://msdn.microsoft.com/en-us/library/windows/desktop/aa366786(v=vs.85).aspx for more info on that.
So now, you should take a step trough the programm, starting at its default Entrypoint (OEP) and look for calls tho one of those functions, possibly with the memory protection set to PAGE_EXECUTE or a descendant. After that will possibly come some sort of loop decrypting the memory contents, copying them to their new location. You might want to just step over it, depending on what your interest in the program is by justr placing the cursor after the loop (thick blue line in IDA usually) and clicking "Run to Cursor" from the menu that appears upon right clicking the assembler code.
If that fails, just try placing a Hardware Breakpoint on kernel32.dll's VirtualAlloc and see if you get anything interestin when stepping into the return statement so you end up wherever the execution chain will take you after the Alloc or Protect call.
You need to find the Relative Virtual Address of that code, this will allow you to find it again regardless of the load address (pretty handy with almost all systems using ASLR these days). the RVA is generally calculated as virtual address - base load address = RVA, however, you might also need to account for the section base as well.
The alternative is to use IDA's rebasing tool to rebase the dll to the same address everytime.

Autoupdate ala Google Chrome workflow

In the company I am I was asked to write an autoupdate function a la chrome. I.e. It should check periodically whether a new version is available, download the new version and apply it silently the next time the application starts.
I already have something up and running but it is more like a dirty hack than something I feel happy about it. So, I would like to know how to design and implement such a solution. My horrible hack works as this:
Have a mechanism to check whether a new version exists (a database query or a web service)
Download a full zip with the whole new version.
Check file signature. If everything went alright, set a registry value: must update to true.
When the application restarts, if the must update value is true, launch an update program and exist.
The update deletes the contents of the application folder, unzips the update and replaces the old contents, launches the application and exits.
Now, I would like to change it, so it works cleaner. I am planning to send the update as a bsdiff file. It gets downloaded. But the question is, what happens next?
When do apply the update?
Who is in charge of applying the patch? is it the program itself or is it a third program, as I did, which is in charge of applying the patch and relaunch the application?
If your going down the C++ route you can go to chromium and download the Chrome source code and dig around to see how the update is done, this might give you a better idea on how to approach it. Here's an article that might help.
If your familiar with .NET the recently release nuget also has an auto update feature that might be useful to look at, you can get the source code from here. David Ebbo has a blog about how its done here.
I'm not up to date on Delphi but you might be able to use either of the above options.
The workflow you proposed is more or less like it should work, but there's no need to re-invent the wheel - there are plenty libraries out there that will do this for you. Using a 3rd party library has the benefit of keeping your code cleaner while making sure the dirty process of auto-update is contained and working flawlessly.
Trust me, I know. I'm the author of NAppUpdate, an app update framework for .NET (which you might want to try out or learn from).
So, after giving it a lot of though, this is what I came with (for active directory I will refer to the directory where the main program lies, active program is the main program and update program is the one that replaces the active program and its resource files):
The active program checks if there is a new version every certain amount of time. If so, download it
Prepare new version in a separate folder (this can be done by copying the contents of the directory with the program to a subdirectory and applying a binary patch, or simply unziping the new version).
Set a flag that indicates that a new version is ready.
When a program is exiting (and one has to control for different interrupts here):
The active program checks the new version ready flag. Launch the update program and exit.
The update program checks if it can write in the active directory. If so, replaces the contents with the prepared version.
The update program has to recheck links and update them accordingly.
So guys, if you have a better workflow, please tell me.
You could literally use the Google Chrome update workflow by using the Google Chrome updater:
http://code.google.com/p/omaha/
They open sourced it Feb 2009.

Can any linux API or tool watch for any change in any folder below e.g. /SharedRoot or do I have to setup e.g. inotify for each folder?

I have a folder with ~10 000 subfolders.
Can any linux API or tool watch for any change in any folder below e.g. /SharedRoot or do I have to setup inotify for each folder? (i.e. I loose if I want to do this for 10k+ folders). I guess yes, since I've already seen examples of this inefficient method, for instance http://twistedmatrix.com/trac/browser/trunk/twisted/internet/inotify.py?rev=28866#L345
My problem:
I need to keep folders time-sorted with most recently active "project" up top.
When a file changes, each folder above that file should update its last-modified timestamp to match the file. Delays are ok. Opening a file (typically MS Excel) and closing again, its file date can jump up and then down again. For this reason I need to wait until after a file is closed, then queue the folder of that file for checking, and only a while later do I go and look for the newest file in its folder, since the filedate of the triggering file could already be back-dated to its original timestamp by Excel or similar programs. Also in case several files from same folder are used/created, it makes sense to buffer timestamping of that folders' parents to at least get a bunch of updates collapsed into one delayed update.
I'm looking for a linux solution. I have some code that can be run on a windows server, most of the queing functionality is here: http://github.com/sesam/FolderdateFollowsFiles/blob/master/FolderdateFollowsFiles/Follower.vb
Available API:s
The relative of inotify on windows, ReadDirectoryChangesW, can watch a folder and its whole subtree; see bWatchSubtree on http://msdn.microsoft.com/en-us/library/aa365465(VS.85).aspx
Samba?
Patching samba source is a possibility, but perhaps there are already hooks available? Other possibilities, like client side (various windows versions) and spying on file activities in order to update folders recursively?
Yes, you need to use inotify, however you need not consume watches on every node immediately.
The process (similar to how beagle does it) is rather simple:
Establish a watch on the root node.
Do a breadth first (not depth first) search starting at the root node
Establish watches on directories, in the order of the search.
Watch for directory create events, continue adding as they do. Re-sort your list as this happens.
The breadth first search is important, otherwise you might miss some stuff due to a race of when you start and what clients of the root node are doing.
See this question, which also mentions this RFQ. I had the same exact problem that you are facing.
In essence, one thread continues to watch for directory create events, adding new watches on new directories almost at the same time that they are created. Something else sorts the list either on demand, or after the inotify thread releases its lock.
I've attempted lock-free versions of the above, but with .. questionable .. success :)
I saw you are running these trees under a Samba share. Maybe you can use the ClamAV virus scanning VFS module for inspiration to see how they trigger the 'scan on close'.
Samba Howto : Stackable VFS Modules
It should be pretty straightforward to check the time of the closed file and modify the directory path leading to it without any of the performance/memory overhead associated with inotify et al.
Just a thought.

When is it okay to check if a file exists?

File systems are volatile. This means that you can't trust the result of one operation to still be valid for the next one, even if it's the next line of code. You can't just say if (some file exists and I have permissions for it) open the file, and you can't say if (some file does not exist) create the file. There is always the possibility that the result of your if condition will change in between the two parts of your code. The operations are distinct: not atomic.
To make matters worse, the nature of the problem means that if you're tempted to make this check, odds are you're already worried or aware that something you don't control is likely to happen to the file. The nature of development environments make this event less likely to happen during your testing and very difficult to reproduce. So not only do you have a bug, but the bug won't show up while testing.
Therefore under normal circumstances the best course of action is to not even try to check if a file or directory exists. Instead, put your development time into handling exceptions from the file system. You have to handle these exceptions anyway, so this is a much better use of your resources. Even though exceptions are slow, checking the existence of a file requires an extra trip to disk, and disk access is much slower. I even have a well-voted answer to this effect in another question.
But I'm having some doubts. In .Net, for example, if that's really always true, the .Exists() methods wouldn't be in the API in the first place. Also consider scenarios where you expect your program to need to the create file. The first example that comes to mind is for a desktop application. This application installs a default user-config file to it's home directory, and the first time each user starts the application it copies this file to that user's application data folder. It expects the file not to exist on that first startup.
So when is it acceptable to check in advance for the existence (or other attributes, like size and permissions) of a file? Is expecting failure rather than success on the first attempt a good enough rule of thumb?
The File.Exists method exists primarily for testing for the existence of a file when you do not intend to open the file. For example testing for the existence of a locking file whose very existence tells you something but whose contents are immaterial.
If you are going to open the file then you will need to handle any exception regardless of the results of any prior calls to File.Exists. So, in general, there is no real value in calling it in these circumstances. Just use the appropriate FileMode enumeration value in your open method and handle any exceptions, as simple as that.
EDIT: Even though this is couched in terms of the .Net API, it is based on the underlying system API. Both Windows and Unix have system calls (i.e. CreateFile) that use the equivalent of the FileMode enumeration. In fact in .Net (or Mono) the FileMode value is just passed through to the underlying system call.
As a general policy, methods like File.Exists, or properties like WeakReference.Alive or SomeConcurrentQueue.Count are not useful as a means of ensuring that a "good" state exists, but can be useful as a means of determining that a "bad" state exists without doing any unnecessary (and possibly counterproductive) work. Such situations may arise in many scenarios involving locks (and files, since they often include locks). Because all routines that need to lock on a set of resources should, whenever practical, always acquire locks on those resources in a consistent order, it may be necessary to acquire a lock on one resource which is expected to exist before acquiring a resource which may or may not exist. In such a scenario, while it's impossible to avoid the possibility that one might lock the first resource, fail to acquire the second, and then release the first lock without having done any useful work with it, checking for the existence of the second resource before acquiring the lock on the first would minimize unnecessary and useless effort.
It depends on your requirements, but one way is to try to obtain an exclusive open file handle, with some sort of retry mechanism. Once you have that handle, it's going to be hard (or impossible) for another process to delete (or move) that file.
I've used code in .NET similiar to the following to obtain an exclusive file handle, where I expect some other process to be possibly writing the file:
FileInfo fi = new FileInfo(fullFilePath);
int attempts = maxAttempts;
do
{
try
{
// Asking to open for reading with exclusive access...
fs = fi.Open(FileMode.Open, FileAccess.Read, FileShare.None);
}
// Ignore any errors...
catch {}
if (fs != null)
{
break;
}
else
{
Thread.Sleep(100);
}
}
while (--attempts > 0);
One example: You may be able to check for existence of files which you are unable to open (due to, for example, permissions).
Another, possibly better example: You want to check for the existence of a Unix device file. But definitely do not open it; opening it has side effects (e.g., open/close /dev/st0 will rewind the tape)
In *nix environment a well established method for checking if another copy of the program is already running is to create a lock file. So the check for file existence is used to verify this.
I'd only check it if I expect it to be missing (e.g. the application settings) and only if I have to read the file.
If I have to write to the file, it's either a logfile (so I can just append to it or create a new one) or I replace the contents of it, so I might as well recreate it anyway.
If I expect that the file exists, it would be right that an Exception is thrown. Exception handling should then inform the user or perform recovery. My opinion is that this results in cleaner code.
File protection (i.e. not overwriting (possibly important) files) is different, in that case I'd always check whether a file exists, if the framework doesn't do that for me (think SaveFileDialog)
I think the check makes sense when you want to be sure the file was there in the first place. As you said settings files...if there is a file I will try and merge the existing settings instead of blowing them away.
Other cases would be when a user tells me to do something with a file. Yes I know the openFileDialog will check if a file exists (But this is optional). I vaguely remeber back in VB6 this was not the case, so verifying the file existed that they just told me to use was common.
I'd rather not program by exception.
Edit
I didn't miss the point. You might try and access the file, an exception is thrown and then when you go to create the file, the file was already placed there. Which now causes your exception handling code to go on the fritz. So I guess we could then have an exception handler in our exception handler to catch that the file changed yet again...
I'd rather try and prevent exceptions, not use them to control logic.
Edit
Additionally another time to check for attributes such as size is when your waiting for a file operation to finish, yes you never know for sure but with a good algorithim and depending on the system writting the file you might be able to handle a good deal of cases (Had a system running for five years which watched for small files coming over ftp, and it uses a the same api as the file system watcher, and then starts polling waiting for the file to stop changing, before raising an event that the file is ready to be consumed).
This may be too simplistic, but I would think the primary reason for checking for the existence of a file (hence the existence of .Exists()) would be to prevent unintended overwrites of existing files, not to avoid exceptions caused by attempting to access non-existent nor non-accessible files.
EDIT 2
This was, in fact, too simplistic and I recommend you see Stephen Martin's response.
If you're that concerned about somebody else removing the file, perhaps you should implement some sort of locking system. For instance, I used to work on the code for C-News, a Usenet news server. Since a lot of the things it did could happen asynchronously, it would "lock" a file or a directory by making a temp file, and then hard linking it to a file named "LOCK". If the link failed, it would mean that some other version of the program was writing to that directory, otherwise it was yours and you could do what you like.
The nifty thing about this is that most of the program was written in shell and awk, and this was a very portable locking mechanism. Also, the lock file would contain the PID of the owner, so you could look at the existing lock file to see if the owner was still running.
We have a diagnostic tool that has to gather a set of files, installer log included. Depending on different conditions the installer log can be in one of two folders. Even worse, there can be different versions of the log in both of these folders. How does the tool find the right one?
It's quite simple if you check for existence. If only one is present, grab that file. If two exist, find which has the latest modification time and grab that file. That's just normal way of doing things.
While this is a language-agnostic post, it seems you are talking about .NET. Most systems (.NET and others) have more detailed APIs in order to figure out if the file exists when opening the file.
What you should do is make a call to access the file, as it will typically indicate through some sort of error that the file doesn't exist (if it truly doesn't). In .NET, you would have to go through the P/Invoke layer and use the CreateFile API function. If that function returns an error of ERROR_FILE_NOT_FOUND, then you know that the file does not exist. If it returns successfully, then you have a handle that you can use.
The point here is that it is a somewhat atomic operation, which ultimately is what you are looking for.
Then, with the handle, you can pass it to a FileStream constructor and perform your work on the file.
There are a numbers of possible applications you may well be writing that a simple File.Exists is more than adequate for the job. If it's a config file that only your application will use then you do not need to go so overkill in your exception handling.
Whilst the "flaws" you have pointed out in using this method are all valid, it doesn't mean they are not acceptable flaws for some situations.
A variety of apps include built-in web servers. It's common for them to generate self-signed SSL certificates the first time they start up. A straightforward way to implement this would be to check whether the cert exists on startup, and create it if not.
In theory, it could exist for the check, and not exist later. In that case, we'd get an error when we try to listen, but that can be handled quite easily and is not a big deal.
It's also possible that it doesn't exist for the check, and exists later. In that case, it either gets overwritten with a new cert, or writing the new cert fails, depending on your policy. The first is a little annoying, in terms of the cert change causing some alarm, but also not really critical, especially if you do a bit of logging to indicate what is going on.
And, in practice, both cases are extraordinarily unlikely to ever come up.
Like you pointed out its always important what the program should do if the file is missing. In all my applications the user can always delete the config file and the application will create a new one with default values. No Problem. I also ship my applications without config files.
But users tend to delete files and even files they should not delete like serial keys and template files. I always check for these files because without them the application is unable to run at all. I can not create a new serial key from default.
Whats should happen when the file is missing? You can do a file find or exception handler but the real question is : What will happen when the file is missing? Or how important is the file for the application. I check all the time before I try to access any support files for the app. Additional I do error handling if the file is corrupt and can not be loaded.
I think anytime that you know that the file may or may not exist and you want to perform some alternate action based on the existence of the file, you should do the check because in this case it's not an exceptional condition for the file to not exist. This won't absolve you from having to handle exceptions -- from someone else either removing or creating the file between the check and your open -- but it makes the intent of the program clear and doesn't rely on exception handling to perform flow-control logic.
EDIT: An example might be log rotation on start up.
try
{
if (File.Exists("app.log"))
{
RotateLogs();
}
log = File.Open("app.log", FileMode.CreateNew );
}
catch (IOException)
{
...another writer, perhaps?
}
catch (UnauthorizedAccessException)
{
...maybe I should have used runas?
}
To answer my own question (in part), I want to expand on the example I used: a default config file.
Rather than check if it exists at app startup and try to copy the file if the check fails, the thing to do is always try to copy the file. You just do it in such a way that the copy will fail if the file exists rather than replace an existing file. This way all you need to do is catch and ignore any exception thrown if the copy fails because of an existing file.
Your problem could easily be solved with basic computer science... read up on Semaphores.
(I did not mean to sound like a jerk, I was just pointing you to a simple answer for a common problem).
I think the reason for "Exists" is to determine when files are missing without the need for creating all the OS housekeeping data required to access the file or having exceptions being thrown. So it's a file handling optimisation more than anything else.
For a single file, the saving the "Exists" gives is generally insignificant. If you were checking if a file exists many, many times (for example, searching for #include files) then the saving could be significant.
In .Net, the specification for File.Exists doesn't list any exceptions that the method might throw, unlike for example File.Open which lists nine exceptions, so there's certainly less checking going on in the former.
Even if "Exists" returns true, you still need to handle exceptions when opening the file, as the .Net reference suggests.