How to read a target extent from a huge TIFF file with the LibTiff.Net library

I have a big TIFF file that I don't want to load into memory all at once (that would make my application use far too much memory). Instead, I want to load one target part of it at a time and show that part on screen.
I am trying to use the LibTiff.Net library to implement this, but I haven't found a suitable API for it.
Currently I can only do it by allocating a new (very big!) array and then calling ReadRGBAImageOriented to fill it with the RGBA values.
Does anyone have experience with this?
Thanks
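For reference, here is a minimal sketch of the tile-based approach with LibTiff.Net (BitMiracle.LibTiff.Classic). It assumes the TIFF is stored tiled; the file path and target coordinates are placeholders. ReadRGBATile decodes a single tile, so only one tile's worth of pixels is held in memory at a time (for stripped files, ReadRGBAStrip plays the same role per strip):

using System;
using BitMiracle.LibTiff.Classic;

class TiffRegionReader
{
    // Read only the tile that contains (targetX, targetY) instead of the whole image.
    static int[] ReadTileAt(string path, int targetX, int targetY)
    {
        using (Tiff image = Tiff.Open(path, "r"))
        {
            if (image == null || !image.IsTiled())
                throw new InvalidOperationException("Expected a tiled TIFF");

            int tileWidth = image.GetField(TiffTag.TILEWIDTH)[0].ToInt();
            int tileHeight = image.GetField(TiffTag.TILELENGTH)[0].ToInt();

            // Tile origin (a multiple of the tile size) that contains the target pixel.
            int tileCol = (targetX / tileWidth) * tileWidth;
            int tileRow = (targetY / tileHeight) * tileHeight;

            // Buffer for one tile only, not the whole image.
            int[] raster = new int[tileWidth * tileHeight];
            image.ReadRGBATile(tileCol, tileRow, raster);

            // Packed pixel values; unpack with Tiff.GetR / GetG / GetB / GetA
            // and copy the needed sub-rectangle into your on-screen bitmap.
            return raster;
        }
    }
}

Whether this works for your file depends on how it was written: tiled TIFFs (common for large scanned or geospatial images) support this kind of random access cheaply, while a single huge strip forces you to decode much more than you display.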

Related

Why can't I read a JSON file with a different program while my Processing sketch is still open?

I'm writing data to a JSON file in Processing with the saveJSONObject command. I would like to access that JSON file with another program (MAX/MSP) while my sketch is still open. The problem is, MAX is unable to read from the file while my sketch is running. Only after I close the sketch is MAX able to import data from my file.
Is Processing keeping that file open somehow while the sketch is running? Is there any way I can get around this problem?
It might be easier to stream your data straight to MaxMSP using the OSC protocol. On the Processing side, have a look at the oscP5 library and on the Max side at the udpreceive object.
You could send your JSON object as a string and unpack it in Max (perhaps using the JavaScript support already present in Max), but it might be simpler to mirror the structure of your JSON object in the arguments of an OSC message, which you can simply unpack in Max directly.
Probably because I/O is usually buffered (notably for performance reasons, and also because the hardware does I/O in blocks).
Try to flush the output channel, perhaps using PrintWriter::flush or something similar.
Details are implementation specific (and might be operating system specific).
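To illustrate the same idea outside Processing, here is a hypothetical C# writer (not the poster's setup; file name and content are made up) that flushes after each write and opens the file with read sharing enabled, so another program can see the data while the file is still open:

using System.IO;

class SharedJsonWriter
{
    static void Main()
    {
        // FileShare.Read lets another process open the file for reading while we hold it;
        // Flush pushes buffered bytes out instead of leaving them in memory.
        using (var stream = new FileStream("data.json", FileMode.Create,
                                           FileAccess.Write, FileShare.Read))
        using (var writer = new StreamWriter(stream))
        {
            writer.Write("{\"value\": 42}");
            writer.Flush();
        }
    }
}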

Better way to bypass Haxe 16MB embedding file size limit?

I just noticed that Haxe (OpenFL) limits a single embedded file's size to 16MB when using the @:file tag (or openfl.Assets). Flash/Flex can embed much larger files directly. My way of solving the problem is to split one large file into several smaller files and combine them at run time, but this is not always convenient. Is there a better way to bypass this limit in Haxe?
First of all, embedding files this big is generally not a good idea:
the binary size becomes huge
it slows down compilation (because quite a large amount of data has to be copied on every build)
the data is forced to stay in RAM/swap while the application is running
But, speaking of solving the exact problem... I'm not exactly sure whether SWF allows embedding chunks of data this big at all (you'd need to look at the bytecode spec), but in any case the limitation seems to be there because of an internal OCaml limitation on string size. It can be fixed, I believe, but you would need to rewrite part of the Haxe SWF generator.
If you don't want to fix the compiler (which may not be possible if SWF doesn't allow embedding chunks this big), you can just go with a simple macro that transparently slices your file into parts, embeds them, and reassembles them at runtime.

Load Compressed csv data in d3 for page optimization

I have 50MB of CSV data. Is there any way I can compress the data when loading it into d3.js/dc.js charts? Right now the page is too slow and I would like to optimise it. Any help is much appreciated.
Thanks in advance
I think it would be best to implement a lazy-loading solution. The idea is simple: you create a small, say 2MB, CSV file and render your visualization from it. At the same time you start loading your full 50MB CSV.
Here is a small snippet:
DS = {} // your app holder for keeping global scope clean
d3.csv('data/small.csv', function(err, smallCSV) {
  // Start loading big file immediately
  d3.csv('data/big.csv', function(err, bigCSV) {
    DS.data = bigCSV // when big data is loaded it replaces old partial data
    DS.drawViz()     // redraw viz
  })
  // This portion of code also starts immediately, while big file is still loading
  DS.data = smallCSV
  DS.drawViz() // function which has all your d3 code and uses DS.data inside
})
The switch from the small file to the big one can be done in such a way that the user has no clue anything happened in the background. Consider an example where a fairly big data file is loaded and you can feel the lag at the start; that kind of app could load much faster if the data were loaded in two rounds.
That's a lot of data; give us a sample of the first couple rows. What are you doing with it, and how much of it affects what's on screen? Where does the csv come from (i.e., local or web service)?
If it's a matter of downloading the resource, then depending on how common and large the values are, you may be able to refactor them into 1-byte keys with the definitions pre-loaded (hash maps are O(1) access). Also, if you're using a large amount of numerical data, perhaps a different number base (i.e., something that uses fewer characters than base 10) can shave some bytes off the final size, since the CSV values are strings.
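As an illustration of that key-substitution idea, here is a hypothetical preprocessing sketch in C# (the stack isn't specified in the question, so the file names and the choice of language are assumptions): every distinct value is replaced by a short key, and the key-to-value map is written out separately so the client can pre-load it and expand rows on demand.

using System.Collections.Generic;
using System.IO;
using System.Linq;

class CsvCompactor
{
    static void Main()
    {
        var keyForValue = new Dictionary<string, string>();

        // Rewrite each row, substituting a short key for every distinct value.
        // Naive split: assumes no quoted commas; the header row is encoded too,
        // so handle it separately if you need it readable.
        var compactRows = File.ReadLines("big.csv")
            .Select(line => string.Join(",", line.Split(',').Select(value =>
            {
                if (!keyForValue.TryGetValue(value, out var key))
                {
                    key = keyForValue.Count.ToString("x"); // short hex keys
                    keyForValue[value] = key;
                }
                return key;
            })));

        File.WriteAllLines("big.compact.csv", compactRows);

        // The lookup table is loaded once by the client and used for O(1)
        // expansion of keys back to the original values.
        File.WriteAllLines("keys.csv",
            keyForValue.Select(pair => pair.Value + "," + pair.Key));
    }
}

This only pays off when values repeat a lot; for mostly unique strings the map itself grows as large as the data, which is the caveat raised in the next paragraph.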
It sounds like CSV may not be the way to go, though, especially if your CSV is mostly unique strings or certain numerical data that won't benefit from the above optimizations. If you're loading the CSV from a web service, you could change it so that certain chunks are returned via some passed key (or handle it smarter server-side). So you would load only what you need at any given time, and probably cache it.
Finally, you could schedule multiple async calls to load the whole thing in small chunks, similar to what was suggested by leakyMirror. Since it would probably make the most sense to use a lot of chunks, you'd want to do it with code (instead of typing all of those callbacks) and use an async event scheduler. I know there's a popular async library (https://github.com/caolan/async) that has a bunch of ways to do this, or you can write your own callback scheduler.

HTML5: accessing large structured local data

Summary:
Are there good HTML5/javascript options for selectively reading chunks of data (let's say to be eventually converted to JSON) from a large local file?
Problem I am trying to solve:
Some existing program runs locally and outputs a ton of data. I want to provide a browser-based interactive viewer that will allow folks to browse through these results. I have control over how the data is written out. I can write it all out in one big file, but since it's quite large, I can't just read the whole thing into memory. Hence, I am looking for some kind of indexed or db-like access to it from my webapp.
Thoughts on solutions:
1. Brute force: the HTML5 FileReader API has a nice slice() method for random access. So I could write out some kind of index at the beginning of the file, use it to look up the positions of the other stored objects, and read them whenever they're needed. I figured I'd ask whether there are already JavaScript libraries that do something like this (or better) before trying to implement this ugly thing (see the sketch after this list for the writer side).
2. HTML5 local database. Essentially, I am looking for an analog of the HTML5 openDatabase() call that would open a (read-only) connection to a database based on a user-specified local file. From what I understand, there's no way to specify a file with a pre-loaded database. Furthermore, even if there were such a hack, it's not clear whether the local file format would be the same across browsers. I've seen the PhoneGap solution that populates the browser's local database from SQL statements. I can do that too, but the data I am talking about is quite large (5-10GB): it would take a while to load, and such duplication seems rather pointless.
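Since the writer side is under the poster's control, here is a hypothetical sketch (in C#, purely as an illustration; the 32-byte key field and the overall layout are assumptions, not anything from the question) of how option 1's index-at-the-front layout could be produced, so that FileReader.slice() can jump straight to a record:

using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

class IndexedBlobWriter
{
    // Layout: [8-byte entry count][N fixed-size index entries][record data].
    // Each index entry: 32-byte ASCII key + 8-byte offset + 8-byte length.
    static void Write(string path, IDictionary<string, string> jsonChunks)
    {
        var payloads = new List<KeyValuePair<string, byte[]>>();
        foreach (var pair in jsonChunks)
            payloads.Add(new KeyValuePair<string, byte[]>(pair.Key, Encoding.UTF8.GetBytes(pair.Value)));

        const int entrySize = 32 + 8 + 8;
        long dataStart = 8 + (long)entrySize * payloads.Count;

        using (var writer = new BinaryWriter(File.Create(path)))
        {
            writer.Write((long)payloads.Count);

            long offset = dataStart;
            foreach (var pair in payloads)
            {
                var keyBytes = new byte[32]; // assumes short ASCII keys; zero-padded
                Encoding.ASCII.GetBytes(pair.Key, 0, Math.Min(pair.Key.Length, 32), keyBytes, 0);
                writer.Write(keyBytes);
                writer.Write(offset);
                writer.Write((long)pair.Value.Length);
                offset += pair.Value.Length;
            }

            foreach (var pair in payloads)
                writer.Write(pair.Value);
        }
    }
}

On the browser side, the header and any individual record could then be pulled out with slice(offset, offset + length) without touching the rest of the file.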
HTML5 does not sound like the appropriate answer for your needs. HTML5's focus is on the client side, and based on your description you're asking a lot of the browser, most likely more than it can handle.
I would instead recommend that you look at a server-based solution to deliver the desired results to the client view; something like Splunk would be a good product to consider.

Reverse engineering a custom data file

At my place of work we have a legacy document management system that for various reasons is now unsupported by the developers. I have been asked to look into extracting the documents contained in this system to eventually be imported into a new 3rd party system.
From tracing and process monitoring I have determined that the document images (mainly tiff files) are stored in a number of 1.5GB files. These files seem to be read from a specific offset and then written to a tmp file that is then served via a web app to the client, and then deleted.
I guess I am looking for suggestions as to how I can inspect these large files that contain the tiff images, and eventually extract and write them to individual files.
Are the TIFFs compressed in some way? If not, then your job may be pretty easy: stitch the TIFFs together from the 1.5G files.
Can you see the output of a particular 1.5G file (or series of them)? If so, then you should be able to piece together what the bytes should look like for that TIFF if it were uncompressed.
If the bytes don't appear to be there, then try some standard compressions (zip, tar, etc.) to see if you get a match.
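A quick way to test whether raw TIFF bytes are sitting in the containers is to scan for the TIFF magic numbers: II*\0 (little-endian) or MM\0* (big-endian). A rough C# sketch (the container file name is a placeholder) that simply prints candidate offsets:

using System;
using System.IO;

class TiffSignatureScanner
{
    static void Main()
    {
        // TIFF files start with "II*\0" (little-endian) or "MM\0*" (big-endian).
        byte[][] signatures =
        {
            new byte[] { 0x49, 0x49, 0x2A, 0x00 },
            new byte[] { 0x4D, 0x4D, 0x00, 0x2A }
        };

        // ~1.5GB fits in memory on a 64-bit process; a chunked scan would avoid
        // loading the whole file if memory is tight.
        byte[] data = File.ReadAllBytes("container-0001.dat");

        for (int offset = 0; offset <= data.Length - 4; offset++)
        {
            foreach (byte[] sig in signatures)
            {
                if (data[offset] == sig[0] && data[offset + 1] == sig[1] &&
                    data[offset + 2] == sig[2] && data[offset + 3] == sig[3])
                {
                    Console.WriteLine("Possible TIFF header at offset {0}", offset);
                }
            }
        }
    }
}

If plausible headers turn up, the images are probably stored as-is and extraction is just a matter of copying byte ranges; if nothing matches, the compression theory above becomes more likely.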
I'd open a file, seek to the required offset, and then stream into a tiff object (ideally one that supports streaming from memory or file). Then you've got it. Poke around at some of the other bits, as there's likely metadata about the document that may be useful to the next system.
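A sketch of that seek-and-copy step in C# (the offset and length are placeholders that would come from process tracing or from a signature scan like the one above):

using System;
using System.IO;

class TiffExtractor
{
    // Copy one embedded image out of a 1.5GB container into its own .tif file.
    static void Extract(string containerPath, long offset, long length, string outputPath)
    {
        using (FileStream container = File.OpenRead(containerPath))
        using (FileStream output = File.Create(outputPath))
        {
            container.Seek(offset, SeekOrigin.Begin);

            byte[] buffer = new byte[81920];
            long remaining = length;
            while (remaining > 0)
            {
                int read = container.Read(buffer, 0, (int)Math.Min(buffer.Length, remaining));
                if (read == 0) break; // unexpected end of container
                output.Write(buffer, 0, read);
                remaining -= read;
            }
        }
    }
}

From there the extracted file can be opened with any TIFF library to check that it decodes and to pull out whatever per-document metadata the next system needs.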