Algorithm issue with TIFF CCITT Group 4 decompression (T.6) - tiff

I work for an engineering design house and we store black and white design drawings in TIFF format compressed with CCITT Group 4 compression.
I am working on a project to improve our software for working with these drawings. I need to be able to load the raw data into my program obviously, so I must decompress it.
I tried using LibTiff but gave up on that rather quickly. It wouldn't build, generating over 2000 errors. I found many obvious syntax errors in the library and concluded it was junk. I spent about 3 hours trying to find the part of the library that implements the CCITT Group 4 codec but no luck, that code is an incomprehensible mess.
So it is that I am writing my own codec for the program. I have it mostly working well, but I am stuck on a problem. I cannot find good documentation on this format. There are a lot of good overviews that describe generally how 2D Modified Huffman compression works, but I can't find any that have specific, implementation-level details. So I am trying to work it out by using some of the drawing files as examples.
I have vertical and pass modes working well and my algorithm decompresses about a third of the image properly before it goes off into the weeds and produces garbage.
I traced the problem to horizontal mode. My algorithm for horizontal mode expects to see the horizontal mode code 001, followed by an optional set of make-up codes and a terminating code in the current pen color, followed by another optional set of make-up codes and a terminating code in the opposite color.
This algorithm worked well for a third of the way through the image, but suddenly I encountered a horizontal mode run where the opposite color comes before the current pen color.
The section of the image is a run of 12 black pixels followed by a run of 22 white pixels.
The code bits from that section are 00100000110000111, which decodes to Horizontal (001), 22 White (0000011), 12 Black (0000111), which, as you can see, is the opposite of the order in which the pixels appear in the image.
Since my algorithm expects image order listing, it crashes. But the previous 307 instances of horizontal mode in this same image file were all in image order. This is the only reversed one I have found (so far).
Other imaging programs display this file just fine. I tried manually editing the bits in the image file just as a test to put the order in image order and that causes other imaging programs to crash when decoding the image. This leads me to believe they have some way of knowing that it is reversed in that instance.
Anyone know specific implementation level details about this TIFF CCITT G4 encoding which could help me understand how and why the run codes are sometimes reversed?
Thanks
Josh

CCITT G4 horizontal codes are always encoded as a pair (black/white) or (white/black). It depends on the current pen color. A vertical code will flip the color, but a horizontal code will leave the color unchanged. The first run of the pair is always in the current pen color: if the current pen color is black, you decode a black run code followed by a white one; if the current pen color is white, you do the opposite.

Code : 00100000110000111
001 : Horizontal Mode
0000011000 : Black RunLength 17
0111 : White RunLength 2
It is Black first.
Run codes are not reversed.
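To make the rule concrete, here is a minimal Python sketch (not LibTiff code) of decoding the two runs that follow a horizontal-mode code. The tables below contain only the few T.4 terminating codes needed for the bits in the question; a real decoder needs the full white/black terminating and make-up tables, and read_run would sit on top of your own bit reader.

    # Hedged sketch only: partial code tables, just enough for this example.
    WHITE_CODES = {"0111": 2, "0000011": 22}          # white terminating codes (partial)
    BLACK_CODES = {"0000111": 12, "0000011000": 17}   # black terminating codes (partial)

    WHITE, BLACK = "white", "black"

    def read_run(bits, pos, table):
        """Match a prefix code starting at pos; return (run_length, new_pos)."""
        for length in range(2, 14):                   # T.4 codes are 2..13 bits long
            code = bits[pos:pos + length]
            if code in table:
                return table[code], pos + length
        raise ValueError("unknown code at bit %d" % pos)

    def decode_horizontal(bits, pos, color):
        # After the '001' code the two runs come in image order: first the run
        # in the current pen color (a0a1), then the opposite color (a1a2).
        # Horizontal mode leaves the pen color unchanged afterwards.
        if color == BLACK:
            first, second = BLACK_CODES, WHITE_CODES
        else:
            first, second = WHITE_CODES, BLACK_CODES
        run1, pos = read_run(bits, pos, first)
        run2, pos = read_run(bits, pos, second)
        return (run1, run2), pos

    # The bits from the question, minus the leading '001', with the pen color black:
    print(decode_horizontal("00000110000111", 0, BLACK))   # -> ((17, 2), 14)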


Is it possible to grab the 4 numbers from this image using IronOCR?

So my friends and I play a game, and they recently changed their images from a white background with black letters to a black background with colorful letters. The old OCR that was created years ago by someone is pretty useless now, as the accuracy is very low if not 0% (it just took the old OCR ~250 attempts). So my question is: would I be able to extract the text from the following picture?
I have never used IronOCR, and I tried using the default code to get text from an image, but the results were weird.
Thanks in advance!
You can try to segment the image first by color (a histogram analysis will tell you which colors are in the image). Then you can convert the segments to b/w and run OCR. You'll get better accuracy.
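Not IronOCR-specific, but a rough sketch of that preprocessing idea, assuming OpenCV and pytesseract are installed ('captcha.png' is a placeholder filename):

    import cv2
    import pytesseract

    img = cv2.imread("captcha.png")

    # Colorful letters on a black background: anything bright enough is treated
    # as text. Tune the threshold (or use per-color masks) for your images.
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    _, bw = cv2.threshold(gray, 60, 255, cv2.THRESH_BINARY)

    # OCR engines usually prefer dark text on a light background, so invert.
    bw = cv2.bitwise_not(bw)

    print(pytesseract.image_to_string(bw))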

What is a flag? What is the pygame.SRCALPHA flag used for (I came across that in the example aacircle.py)?

I know almost nothing about pygame, and I'm trying to learn by messing with the examples. Also I'm reading the documentation and might try some tutorials.
I'm coming from a background of javascript with processing.js which I learned on khan academy's computer science section.
Anyways, in the documentation for surface objects (page 4 of the pdf I downloaded) it says "For a plain software surface, 0 can be used for the flag." So as one of my first experiments I changed pygame.SRCALPHA in the example aacircle.py to a zero and tested it out. The screen went from blitting a red background with a black circle to just solid black. Why?
I recommend you check out the pygame documentation on surfaces:
The pixel format can be controlled by passing the bit depth or an
existing Surface. The flags argument is a bitmask of additional
features for the surface. You can pass any combination of these flags:
HWSURFACE, creates the image in video memory
SRCALPHA, the pixel format will include a per-pixel alpha
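A flag is just one bit in that bitmask: you combine flags with bitwise OR when creating the surface, and 0 means "no extra features". A tiny sketch (assuming a standard pygame install):

    import pygame

    pygame.init()

    plain = pygame.Surface((100, 100), 0)                  # plain software surface
    alpha = pygame.Surface((100, 100), pygame.SRCALPHA)    # per-pixel alpha channel

    # Flags are ORed together to request several features at once.
    both = pygame.Surface((100, 100), pygame.HWSURFACE | pygame.SRCALPHA)

    # Inspect which feature bits each surface actually ended up with.
    print(plain.get_flags(), alpha.get_flags(), both.get_flags())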

html5 svg vs canvas for granite like background

I want to make a granite-like background like http://www.tivli.com/ with a gradient at the center. I have found how to do the gradient with both in the w3c tutorials, but are there any tutorials on how to make granite backgrounds in html5 canvas or svg? Thanks.
The site you referenced actually uses a simple 'noize.png' and then uses css3 radial gradients to build up that background. I know you already knew that; I'm mentioning this for future readers.
Given this fact, I'll assume in the rest of my answer you want to learn, not a copy-pasta solution.
I gave up on svg a looong time ago. But in canvas it's easy and fun... (especially now that flash is FINALLY officially dead. Hurray).
So as others have already suggested in the comments to your question, why not use a seamless noise image? (you know where to find one :P).
You could still embed this image as 'DATA' in the html (hint: or even feed image-data straight into canvas, which will render it as your 'noise.php').
But then.. you are right: what if you wanted to change the noize-size?
And you want to know how to make granite/noise anyway..
And is mathematically/programmatically describing this noise lower in character-count (file-size) than supplying a ready-made image(-fragment)?
Start UPDATE 2 part 1:
Actually, after a good night's sleep I realized/remembered that visual noise is one of the BEST ways to determine randomness. Humans are notoriously good at finding visual patterns, even professionals use this (and as such this is also heavily used in cryptography where one would need -for instance- a useful one time pad).
Also see 'commander' Crockford's YUI-lecture 'Principles of Security' from 19m07s to 22m37s.
Now why is this important? Well ECMA-script (aka javascript) defines a loose Math.random() function:
"returns a number value with positive sign, greater than or equal to 0 but less than 1, chosen randomly or pseudo randomly with
approximately uniform distribution over that range, using an
implementation-dependent algorithm or strategy"
Re-read the italic/bold part and welcome yourself to reality: each and every browser (brand/version) has its own random-routine!!
"But what does it mean?" Well.. simply put.. Depending on browser(version)'s ES-Script implementation (cough cough IE): Noise based on Math.random() will/might render visible patterns in your noise (independently of possible tile-size)!!
So for the rest of this answer we are going to assume either an ideal world where browsers spit-out proper random numbers, or that you took control and use a stronger 'predictable' random-solution as is discussed on this wonderful article that google's bubble accidentally leaked :)
End Update 2 part 1
So let's start with the radial gradient-part. You already figured that one out.
Ok, then follows the noise-function in canvas (you could do this before the radial gradient, but this order gives a nicer grain and diffuses the color banding the gradient produces -on an average lcd you would see them anyway since they're not true color-): this is done by generating random pixels.
There would be a lot of different algorithms to use, I've used a straight-forward one that you can understand without math..
Note that generating noise for a modern day full-screen resolution is easily larger than 1 mega-pixel in resolution, so this would be slow! To overcome this we need to generate and RE-USE a small seamless tile. We use this as a pattern-fill in our full-size image that already has the radial gradient.
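(Not canvas code, but a quick Python/NumPy illustration of the same tile-and-repeat idea, with arbitrary sizes and a placeholder filename; note that repeating one tile can itself become visible if the tile is very small, which is exactly the pattern problem discussed above:)

    import numpy as np
    from PIL import Image

    TILE = 32                     # small noise tile edge, in pixels
    W, H = 1280, 720              # "full-screen" target size (placeholder)

    # One small grayscale noise tile...
    tile = np.random.randint(0, 256, (TILE, TILE), dtype=np.uint8)

    # ...repeated to cover the full area (the cheap part), then cropped.
    full = np.tile(tile, (H // TILE + 1, W // TILE + 1))[:H, :W]

    Image.fromarray(full, mode="L").save("noise_background.png")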
I also assume you want the radial gradient liquidly placed in the middle of the view-port, so if you want to go the fixed way (as opposed to the noize.png/css3 way you referenced), you'll also need an extra eventhandler 'onResize()' to have canvas render a new background.
Why? Well if you were to let the browser scale this background-image (created upon page-load) automatically, then the nice grain-size of your noise would change too, EVEN leading to visible PATTERNS that you would not want..
(Since I desperately want to go to sleep now..): The rest is thoroughly explained in the source-code of the function I wrote for you..
Here is the link to the fully documented code I wrote for you: jsfiddle.net/sU74C/ and here you can see it in full-screen preview.   UPDATE 1: function genNoise 80% FASTER!!
Use it if you like (retaining the link to this answer) or learn from it and do your own thing.
PLEASE DON'T FORGET to accept AN answer to this question (hopefully mine :))
Hope this helps!
UPDATE 2 part 2:
There are more ways to interact with canvas. One could also calculate/(re-)use/generate/save/import pixel-maps/arrays (as png or base64 or jpg or ...); for instance, see this excellent article on faster 8bit and even faster 32bit (if the browser supports 'Uint8ClampedArray' as the type of the data property of the ImageData object) pixel-arrays, including a proper solution to account for the endianness of the processor!!
So after giving this some considerable thought, it turns out that to do this 'right' is actually a challenge and should be divided in 2 parts:
Where do I get my noise-data (Math.random() or custom random or pre-defined external (image, json-string, random.com) or embedded (packed?) data)?
What is the fastest way to build/store/re-use this noise on full-screen size/canvas.
Given the statements in part 1 of this update and that we don't want patterns in our visible noise, I'm starting to lean to using some pre-rendered 'random' noise data (meant to tile seamlessly) that is embedded in the noise-generator: otherwise there is the overhead of running your own none-engine-optimized random function (times..a lot..).
Also I think one might get away with just black and white and transparency afterwards.. This might considerably speed things up AND reduce embedded pixel-data.
Think about it: black or white equals 0 or 1..
In base 64 one character can represent 6 bits. So a 30x30px image has 900 px at 1 bit each, divided by 6 bits per character = 150 characters (the sweet spot increments by 6px, so the next is 36px*36px at 216 characters).
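A quick sanity check of that arithmetic in Python, packing the 1-bit pixels ourselves, six per character from a 64-symbol alphabet (this is a custom 6-bit packing, not the byte-oriented stdlib base64):

    import random
    import string

    ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

    bits = [random.randint(0, 1) for _ in range(30 * 30)]   # 900 b/w pixels

    chars = []
    for i in range(0, len(bits), 6):
        value = int("".join(map(str, bits[i:i + 6])), 2)    # 6 bits -> 0..63
        chars.append(ALPHABET[value])

    packed = "".join(chars)
    print(len(packed))   # 900 / 6 = 150 characters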

How to make tesseract to give relevant results in the presence of noise?

I am using tesseract 3.0.0 and I bumped into the following problem:
When there is something too small for tesseract to recognize it seems it's merged with
other fragments. As a result nothing relevant is returned.
The image below shows 3 cases. Only the rectangle with the dashed line is passed to tesseract. Over the rectangle is the result (V over T means new line).
The last case is the problem one. Is there some way to improve tesseract in situations like this?
As far as I know, Tesseract does not have proper image segmentation yet (or Document Analysis, as it is called in commercial OCR applications). Typically, before OCR is done, the image gets split into separate areas that contain text, pictures, barcodes, lines and so on. Then you apply OCR only on the text areas and don't face the problems you have just described.
Earlier versions of Tesseract did not have that functionality at all, and Tesseract was supposed to be used as a line recognizer only, or a so-called field-level recognizer, when you use it on small snippets of text cut from a bigger image.
I have not followed thoroughly what was introduced in 3.0; probably it is already there partially, but obviously it does not work as expected, as you have just found out.
There is another open-source project - OCRopus - that approached this problem exactly as I described: first Document Analysis (aka Segmentation) and only then OCR. Their earlier versions actually used Tesseract for OCR after the analysis step finished. But later they introduced their own OCR (which is still not very good) and moved Tesseract plugin support down the priority list.
Here's what you actually can do to address your problem:
If your images have a very typical structure, you can try to do some dumb segmentation and cut the text from the image yourself before passing it to Tesseract (a rough sketch of this follows after these options). However, if you expect a wide variety of images to be supported, just forget it.
You can check OCRopus and see if their segmentation works for your images. If yes, then you can spend some time making OCRopus + Tesseract work together.
Well, if what you do is not just for fun and you value your time, I would recommend thinking about a real OCR engine like ABBYY. You will get much higher accuracy of both segmentation and OCR out of the box, and professional customer support of course.
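For the first option, here is a rough sketch of what such dumb segmentation could look like, assuming a modern OpenCV (4.x) and pytesseract; 'page.png' is a placeholder filename:

    import cv2
    import pytesseract

    img = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE)
    _, bw = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Dilate so characters on the same line merge into one blob per text region.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (15, 3))
    blobs = cv2.dilate(bw, kernel)

    contours, _ = cv2.findContours(blobs, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if w < 10 or h < 10:           # ignore specks of noise
            continue
        snippet = img[y:y + h, x:x + w]
        print(pytesseract.image_to_string(snippet))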
Disclaimer: I work for ABBYY

How to find pixel co-ordinates of corners of a square pattern?

This may not be programming related, but programmers would possibly be in the best position to answer it.
For camera calibration I have an 8 x 8 square pattern printed on a sheet of paper. I have to manually enter these co-ordinates into a text file. The software would then pick them up from there and compute the calibration parameters.
Is there a script or some software that I can run on these images and get the pixel co-ordinates of the 4 corners of each of the 64 squares?
You can do this with a traditional chessboard pattern (i.e. black and white squares with no gaps) using cvFindChessboardCorners(). You can read more about the function in the OpenCV API Reference and see some sample code in O'Reilly's OpenCV Book or elsewhere online. As an added bonus, OpenCV has built-in functions that calculate the intrinsic parameters of the camera and an array of extrinsic parameters for the multiple views of a planar calibration object.
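For example, a minimal sketch with the Python bindings ('board.png' is a placeholder; an 8 x 8-square board has 7 x 7 inner corners):

    import cv2

    img = cv2.imread("board.png")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    found, corners = cv2.findChessboardCorners(gray, (7, 7))
    if found:
        # Refine to sub-pixel accuracy before writing the coordinates out.
        criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.01)
        corners = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)
        for (x, y) in corners.reshape(-1, 2):
            print(x, y)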
I would:
apply a threshold and get a binarized image.
apply a SobelX filter to the image. You get an image with the vertical lines. These belong to the sides of the squares that are almost vertical. Keep this as image1.
apply a SobelY filter to the image. You get an image with the horizontal lines. These belong to the sides of the squares that are almost horizontal. Keep this as image2.
make (image1 and image2). You get a black image with white pixels indicating the corner positions, since only at the corners do both edge images respond (see the sketch below).
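A rough sketch of that recipe, assuming OpenCV ('pattern.png' is a placeholder filename):

    import cv2

    gray = cv2.imread("pattern.png", cv2.IMREAD_GRAYSCALE)
    _, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Edge response along each axis, binarized.
    sx = cv2.convertScaleAbs(cv2.Sobel(bw, cv2.CV_16S, 1, 0))   # near-vertical sides
    sy = cv2.convertScaleAbs(cv2.Sobel(bw, cv2.CV_16S, 0, 1))   # near-horizontal sides
    _, image1 = cv2.threshold(sx, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    _, image2 = cv2.threshold(sy, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Only where both a vertical and a horizontal edge respond do we keep a pixel.
    corners = cv2.bitwise_and(image1, image2)
    cv2.imwrite("corners.png", corners)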
Hope it helps.
I'm sure there are many computer vision libraries with varying capabilities and licenses out there, but one that I can remember off the top of my head is ARToolKit, which should be able to recognize this pattern. And if that's not possible, it comes with a set of very good patterns that are tailored so that they can be recognized even if they're partially obscured.
I don't know ARToolKit (although I've heard a lot about it), but with OpenCV this processing is trivial.