Watch file(s) for modifications algorithm - language-agnostic

I was simply wondering how file watching algorithms are implemented. For instance, let's say I want to apply a filter (e.g., search/replace a string) to a file every time it is modified. What technique should I use? Obviously, I could run an infinite loop that checks every file in a directory for modifications, but that might not be very efficient. Is there any way to get notified directly by the OS instead? For the sake of demonstration, let's assume a *nix OS and any language (C/Ruby/Python/Java/etc.).

Linux has inotify, and judging from the Wikipedia links, Windows has something similar called 'Directory Management'. Without something like inotify, you can only poll.
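For the polling fallback, a minimal sketch could look like the following. It assumes a single flat directory and treats a change in size or modification time as "modified"; the names snapshot, diff and poll are made up for this illustration.

```python
import os
import time

def snapshot(directory):
    """Map each regular file in `directory` to its (mtime_ns, size)."""
    state = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                info = entry.stat()
                state[entry.name] = (info.st_mtime_ns, info.st_size)
    return state

def diff(old, new):
    """Return (created, deleted, modified) file names between two snapshots."""
    created = sorted(new.keys() - old.keys())
    deleted = sorted(old.keys() - new.keys())
    modified = sorted(n for n in old.keys() & new.keys() if old[n] != new[n])
    return created, deleted, modified

def poll(directory, on_change, interval=1.0):
    """Call on_change(created, deleted, modified) whenever the directory changes."""
    previous = snapshot(directory)
    while True:  # runs until interrupted
        time.sleep(interval)
        current = snapshot(directory)
        changes = diff(previous, current)
        if any(changes):
            on_change(*changes)
        previous = current
```

Calling poll("path/to/dir", print) would print the three lists each time something changes. The cost is one stat per file per interval, which is exactly the overhead that OS notifications avoid.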

On Linux, the inotify subsystem will alert you to file modifications.
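To show roughly what inotify looks like underneath the wrappers, here is a sketch that calls it through ctypes from Python. It is Linux-only, the constants are copied from <sys/inotify.h>, and the helper names parse_events and watch are invented for this example; in practice a maintained wrapper library is the more sensible choice.

```python
import ctypes
import ctypes.util
import os
import struct

# Event masks from <sys/inotify.h>
IN_MODIFY = 0x00000002
IN_MOVED_TO = 0x00000080
IN_CREATE = 0x00000100

# struct inotify_event header: int wd; uint32_t mask, cookie, len;
_HEADER = struct.Struct("iIII")

def parse_events(buf):
    """Decode bytes read from an inotify fd into (wd, mask, name) tuples."""
    events = []
    offset = 0
    while offset + _HEADER.size <= len(buf):
        wd, mask, _cookie, name_len = _HEADER.unpack_from(buf, offset)
        offset += _HEADER.size
        raw_name = buf[offset:offset + name_len]
        offset += name_len
        # the name is NUL-padded; it is empty when the watch is on a file
        name = raw_name.split(b"\0", 1)[0].decode(errors="replace")
        events.append((wd, mask, name))
    return events

def watch(path, mask=IN_MODIFY):
    """Block on inotify and yield (mask, name) for each event on `path`."""
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    fd = libc.inotify_init()
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init failed")
    try:
        if libc.inotify_add_watch(fd, os.fsencode(path), mask) < 0:
            raise OSError(ctypes.get_errno(), "inotify_add_watch failed")
        while True:
            # read() sleeps in the kernel until at least one event is ready
            for _wd, event_mask, name in parse_events(os.read(fd, 4096)):
                yield event_mask, name
    finally:
        os.close(fd)
```

Iterating over watch("path/to/bar", IN_CREATE | IN_MOVED_TO) would then yield one item per new file, with no polling loop in user code.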

Java SE 7 will have File Change Notification (the java.nio.file.WatchService API) as part of the NIO.2 updates.

There are wrappers around inotify that make it easy to use from high-level languages. For example, in Ruby you can do the following with rb-inotify:
require 'rb-inotify'
notifier = INotify::Notifier.new
# tell it what to watch
notifier.watch("path/to/foo.txt", :modify) { puts "foo.txt was modified!" }
notifier.watch("path/to/bar", :moved_to, :create) do |event|
  puts "#{event.name} is now in path/to/bar!"
end
# start watching; events are dispatched to the blocks above
notifier.run
There's also pyinotify but I was unable to come up with an example as concise as the above.


Simple html to pdf conversion commandline tool for automated file creation

I have a system that automatically creates and saves documents as html. For further storage they ought to be pdfs though.
I want to avoid having to do it manually so my preferred solution would be a small executable that I can call via command line, feed it with a source and output path (and ideally further parameters) and then let it do its magic. Something in concept like this:
exampleConverter.exe "C:\source\document1.html" "C:\convertedPDFs\document1.pdf"
No UI whatsoever, no human input, no popping up and closing console.
I looked through several options, but common problems I encountered were:
the software not being free for commercial use
it being just a library of code, not a ready-to-go executable or a code base I only need to compile into one
the tool needing to be installed instead of being 'portable'
I'd like to avoid having to integrate any modern libraries myself, partly because of time concerns and partly because our code runs internally in a less than modern IE & VBS context, so I foresee compatibility problems.
Simply triggering a precompiled executable through a generic command line interface that I can call from VBS seems like the perfect solution here.
Your Windows command line is almost there. Why not reverse the input and output (which makes the task easier later) and add a switch or two? You can then wrap it in a for loop (see for /?) to run through the current working folder, just like any other program.
Your pseudocode
exampleConverter.exe --print-to-pdf="C:\convertedPDFs\document1.pdf" --headless "C:\source\document1.html"
Working Windows native code
"C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe" --print-to-pdf="%CD%\out\document1.pdf" --headless "%CD%\in\document1.html"
Other options are available
learn.microsoft.com suggests this working snippet to run Edge with parameters
wscript vbsEdge.vbs
Dim shell
Set shell = WScript.CreateObject("WScript.Shell")
shell.Run "msedge https://www.google.com --hide-scrollbars --content-shell-hide-toolbar"
So just combine the two approaches. You will need to sort out your own arguments, though.
For greater control, you need to step up to heavier customisations such as WebDriver: https://blogs.windows.com/msedgedev/2015/07/23/bringing-automated-testing-to-microsoft-edge-through-webdriver/ etc.
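If a Python interpreter happens to be available (an assumption; the question itself targets VBS), the same Edge invocation can be sketched as below. The install path is the one quoted above, and the function names edge_pdf_command and convert are made up for this illustration.

```python
import subprocess

# Default install path taken from the command above; adjust for your machine.
EDGE = r"C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe"

def edge_pdf_command(html_path, pdf_path, edge=EDGE):
    """Build the argument list for a headless HTML-to-PDF run."""
    return [edge, "--headless", f"--print-to-pdf={pdf_path}", html_path]

def convert(html_path, pdf_path, edge=EDGE):
    """Run the conversion; raises CalledProcessError if Edge exits non-zero."""
    subprocess.run(edge_pdf_command(html_path, pdf_path, edge), check=True)
```

Passing the arguments as a list avoids having to quote paths that contain spaces, which is the part that tends to go wrong when the command is built as a single string.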

nsys says "please use the qdrep file instead" - huh?

I'm using NVIDIA Nsight Systems version 2019.5.2.16-b54ef97 with CUDA 10.2. I'm running:
nsys profile my_app --some --args=here
so, a plain-vanilla profiling with no funny business. And yet, I get, at the bottom of the output:
... etc. etc. ...
Saving report to file "/some/where/report1.qdrep"
Report file saved.
Please discard the qdstrm file and use the qdrep file instead.
Removed /some/where/report1.qdstrm as it was successfully imported.
Please use the qdrep file instead.
Why am I being told to discard files and use other files instead? Especially given how, eventually, only a single file is generated (a .qdrep file)?
I'm guessing some internal conversion utility is run, and the message is not really intended for me - or am I missing something?
It is just a log message, admittedly a slightly confusing one. The tool then removes the *.qdstrm file for you automatically.

igraph for python

I'm thoroughly confused about how to read/write into igraph's Python module. What I'm trying right now is:
g = igraph.read("football.gml")
g.write_svg("football.svg", g.layout_circle() )
I have a football.gml file, and this code runs and writes a file called football.svg. But when I try to open it using InkScape, I get an error message saying the file cannot be loaded. Is this the correct way to write the code? What could be going wrong?
The write_svg function is sort of deprecated; it was meant only as a quick hack to allow SVG exports from igraph even if you don't have the Cairo module for Python. It has not been maintained for a while so it could be the case that you hit a bug.
If you have the Cairo module for Python (on most Linux systems, you can simply install it from an appropriate package), you can simply do this:
igraph.plot(g, "football.svg", layout="circle")
This would use Cairo's SVG renderer, which is likely to generate the correct result. If you cannot install the Cairo module for Python for some reason, please file a bug report on https://bugs.launchpad.net/igraph so we can look into this.
(Even better, please file a bug report even if you managed to make it work using igraph.plot).
Couple years late, but maybe this will be helpful to somebody.
The write_svg function seems not to escape ampersands correctly. Texas A&M has an ampersand in its label -- InkScape is probably confused because it sees & rather than &amp;. Just open football.svg in a text editor to fix that, and you should be golden!
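If editing by hand is not practical, the same fix can be sketched in a few lines of Python. This is only an illustration and not part of igraph; the function name escape_bare_ampersands is invented here, and the pattern assumes that anything shaped like &name; or &#123; is already a valid entity.

```python
import re

# An ampersand that does not start something shaped like an entity reference
_BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)")

def escape_bare_ampersands(svg_text):
    """Replace unescaped '&' with '&amp;', leaving existing entities alone."""
    return _BARE_AMP.sub("&amp;", svg_text)
```

Reading football.svg, passing its text through this function and writing it back should give a file that an XML parser accepts, provided the ampersands were the only problem.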

how to find which libraries to link to? or, how can I create *-config (such as sdl-config, llvm-config)?

I want to write a program that outputs a list of libraries that I should link to given source code (or object) files (for C or C++ programs).
In *nix, there are useful tools such as sdl-config and llvm-config. But, I want my program to work on Windows, too.
Usage:
get-library-names -l /path/to/lib a.cpp b.cpp c.cpp d.obj
Then, get-library-names would get a list of the function names that are invoked from a.cpp, b.cpp, c.cpp, and d.obj. It would then search all library files in the /path/to/lib directory and list the libraries that are needed to link properly.
Has such a tool already been written? Is such a tool not trivial to write?
How do you find what libraries you should link to?
Thanks.
Yeah, you can create a pkg-config file for your library, which will allow you to run 'pkg-config --cflags <package>' to get the compiler flags or 'pkg-config --libs <package>' to get the linker flags.
http://pkg-config.freedesktop.org/wiki/
If you're on Linux, just try looking into /usr/lib/pkgconfig to find some example .pc files that you can use as models. You can use pkg-config on Windows as well, but it does not ship with the OS.
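As a rough idea of what such a .pc file contains, here is a small Python sketch that renders one. The function name render_pc, the default prefix and the example library names are assumptions for illustration; real projects usually generate this file from their build system.

```python
def render_pc(name, version, description, libs, prefix="/usr/local"):
    """Return the text of a minimal pkg-config .pc file."""
    lines = [
        f"prefix={prefix}",
        "libdir=${prefix}/lib",          # pkg-config expands ${...} itself
        "includedir=${prefix}/include",
        "",
        f"Name: {name}",
        f"Description: {description}",
        f"Version: {version}",
        "Libs: -L${libdir} " + " ".join(f"-l{lib}" for lib in libs),
        "Cflags: -I${includedir}",
    ]
    return "\n".join(lines) + "\n"
```

Saving the output of render_pc("foo", "1.0", "Example library", ["foo"]) as foo.pc in a directory listed in PKG_CONFIG_PATH should let pkg-config --libs foo report the linker flags.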

How can I get a Windows batch or Perl script to run when a file is added to a directory?

I am trying to write a script that will parse a local file and upload its contents to a MySQL database. Right now, I am thinking that a batch script that runs a Perl script would work, but am not sure if this is the best method of accomplishing this.
In addition, I would like this script to run immediately when the data file is added to a certain directory. Is this possible in Windows?
Thoughts? Feedback? I'm fairly new to Perl and Windows batch scripts, so any guidance would be appreciated.
You can use Win32::ChangeNotify. Your script will be notified when a file is added to the target directory.
Checking a folder for newly created files can be implemented using the WMI functionality. Namely, you can create a Perl script that subscribes to the __InstanceCreationEvent WMI event that traces the creation of the CIM_DirectoryContainsFile class instances. Once that kind of event is fired, you know a new file has been added to the folder and can process it as you need.
These articles provide more information on the subject and contain VBScript code samples (hope it won't be hard for you to convert them to Perl):
How Can I Automatically Run a Script Any Time a File is Added to a Folder?
WMI and File System Monitoring
The function you want is ReadDirectoryChangesW. A quick search for a Perl wrapper yields this Win32::ReadDirectoryChanges module.
Your script would look something like this:
use Win32::ReadDirectoryChanges;
# $path is the directory to watch; $filter selects which changes to report
my $rdc = Win32::ReadDirectoryChanges->new(path    => $path,
                                           subtree => 1,
                                           filter  => $filter);
while (1) {
    my @results = $rdc->read_changes;
    # results come back as a flat list of (action, filename) pairs
    while (scalar @results) {
        my ($action, $filename) = splice(@results, 0, 2);
        # ... run script ...
    }
}
You can easily achieve this in Perl using File::ChangeNotify. This module is to be found on CPAN: http://search.cpan.org/dist/File-ChangeNotify/lib/File/ChangeNotify.pm
You can run the code as a daemon or as a service, make it watch one or more directories and then automatically execute some code (or start up a script) if some condition matches.
Best of all, it's cross-platform, so should you want to switch to a Linux machine or a Mac, it would still work.
It wouldn't be too hard to put together a small C# application that uses the FileSystemWatcher class to detect files being added to a folder and then spawn the required script. It would certainly use less CPU / system resources / hard disk bandwidth than polling the folder at regular intervals.
You need to consider what is a sufficient heuristic for determining "modified".
In increasing order of cost and accuracy:
file size (file content can still change as long as the size stays the same)
file timestamp (if you aren't running ntpd, time is not guaranteed to be monotonic)
file sha1sum (very reliable but expensive)
I would run ntpd, loop over the timestamps, and compare the checksum only when a timestamp changes. This can cover a lot of ground in little time.
These methods are not appropriate for a computer security application; they are for file management on a sane system.
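The tiered check described above (cheap metadata first, checksum only when the metadata differs) might be sketched like this in Python. The names fingerprint and content_changed are made up for the example, and the 64 KiB read size is an arbitrary choice.

```python
import hashlib
import os

def fingerprint(path, previous=None):
    """Return (size, mtime_ns, sha1) for `path`.

    If `previous` has the same size and mtime, it is returned unchanged
    and the file is not read at all.
    """
    info = os.stat(path)
    if previous and previous[:2] == (info.st_size, info.st_mtime_ns):
        return previous  # cheap checks agree; skip the expensive hash
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return (info.st_size, info.st_mtime_ns, digest.hexdigest())

def content_changed(previous, current):
    """True only if the content hash differs between two fingerprints."""
    return previous[2] != current[2]
```

Keeping the last fingerprint per file and calling fingerprint(path, last) on each pass means unchanged files cost a single stat, while a file that was merely touched is hashed once and then correctly reported as unchanged.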