Is it possible to extract the text from an image like this?
(I'd like to display it in a text field afterwards)
Thanks.
Uli
What you're looking for is Optical Character Recognition. Here is a similar question:
OCR Actionscript
Though sadly it has no clear-cut answer. There is no native class/framework for doing it in AS3, though I'm sure it's possible.
This is a task where you'd employ a web service. I know Google Docs can OCR an image for you. ABBYY, whose FineReader is one of the best in the business, also provides an OCR web service. Google has open-sourced their OCR software, so you could conceivably set it up on your own server.
I've been working with the Google Vision API, but I have a problem I can't really solve. This is the image I'm dealing with:
In the image above, the Google Vision API (the same happens with IBM Watson and Microsoft Cognitive Services) does not recognize that 2,99€ is something to read, because it is not treated as a single line, so the output is anything but what I expect it to be (the price on the label).
If I was using Tesseract, I would solve this by using the -psm 7 option in order to force it to read it as a single text line, but I can't really find documentation for this situation using Google Vision API.
Has anyone done something similar before? I cannot figure out how to solve this problem...
I have a similar problem, and it appears that the Vision API might not be the right fit for it. The API does not give you any information about the structure of the text it finds (other than the rectangle where the text is located), and in turn it does not let you say anything about that structure either.
AFAIK you can't solve this problem with the Vision API yet, although there might be some sort of solution in the future.
Right now there is the "ImageContext" part of the AnnotateImageRequest, which I hope will eventually be used for exactly what you are trying to do.
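For reference, here is a minimal sketch of where ImageContext sits in a request to the REST endpoint. The helper name `build_text_detection_request` is made up for illustration, and the only text-related hint shown is `languageHints`; as far as I know there is nothing in it that forces a single-line read the way Tesseract's page segmentation modes do.

```python
import base64
import json

# Public REST endpoint for image annotation (needs an API key or OAuth token to call).
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_text_detection_request(image_bytes, language_hints=None):
    """Build the JSON body for one TEXT_DETECTION AnnotateImageRequest.

    Hypothetical helper: it only assembles the body, it does not send it.
    """
    request = {
        "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
        "features": [{"type": "TEXT_DETECTION"}],
    }
    if language_hints:
        # imageContext carries optional hints; it is not a layout switch.
        request["imageContext"] = {"languageHints": list(language_hints)}
    return {"requests": [request]}


if __name__ == "__main__":
    body = build_text_detection_request(b"placeholder image bytes", ["en"])
    print(json.dumps(body, indent=2))
```

A workaround some people try is cropping the image to the price area before sending it, so that the service only sees one line; whether that helps will depend on the label.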
I'm looking for an OCR library that allows me to read text in an image, but only text that is circled. I'd like some feedback on Tesseract OCR for this task. It looks powerful but complex. How would it be used here? Can it be trained for something like this, or would it have to be extended?
Yes, Tesseract is fully trainable. And it just happens that it supports text in a circle also (pagesegmode 9). Give it a try.
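A rough sketch of passing a page segmentation mode on the command line, assuming the `tesseract` binary is installed. The helper `tesseract_command` and the file name are made up for illustration. Tesseract's help text describes mode 9 as "treat the image as a single word in a circle" and mode 7 as a single text line, so locating circled regions on a full page would probably still need a separate step.

```python
import shlex

# Page segmentation modes, as described by Tesseract's own help text.
PSM_SINGLE_LINE = 7   # treat the image as a single text line
PSM_CIRCLE_WORD = 9   # treat the image as a single word in a circle


def tesseract_command(image_path, psm, legacy_flag=False):
    """Return the argv list that runs Tesseract and prints the text to stdout.

    Hypothetical helper; it builds the command but does not execute it.
    """
    # Tesseract 3.x spelled the option "-psm"; 4.x and later use "--psm".
    flag = "-psm" if legacy_flag else "--psm"
    return ["tesseract", image_path, "stdout", flag, str(psm)]


if __name__ == "__main__":
    cmd = tesseract_command("circled_word.png", PSM_CIRCLE_WORD)
    print(shlex.join(cmd))
    # To actually run it (requires Tesseract on PATH):
    #   import subprocess
    #   text = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```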
I would like to take a PDF of a scanned graph paper notebook (with handwriting) and turn it into a text file.
How can I do this?
Thanks
Check out an OCR library, like OCRopus. I don't think it takes PDF, so you may have to convert it to a TIFF or JPEG first.
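One way to do the conversion step, assuming the poppler `pdftoppm` tool is installed (that choice is mine, not something OCRopus requires). The helper name and file names below are placeholders.

```python
def pdf_to_tiff_command(pdf_path, output_prefix, dpi=300):
    """Return the argv list that renders each PDF page to a TIFF file.

    Hypothetical helper; pdftoppm writes one image per page, with file
    names derived from output_prefix.
    """
    # -r sets the resolution in DPI; around 300 is a common choice for OCR input.
    return ["pdftoppm", "-tiff", "-r", str(dpi), pdf_path, output_prefix]


if __name__ == "__main__":
    cmd = pdf_to_tiff_command("notebook.pdf", "page")
    print(" ".join(cmd))
    # To actually run it (requires poppler-utils on PATH):
    #   import subprocess
    #   subprocess.run(cmd, check=True)
```

The resulting page images can then be handed to whichever OCR tool you settle on.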
There are OCR libraries that recognize typed text (OCRopus, Tesseract, etc.).
There are also Java-based handwriting recognition libraries. I am not sure whether OCRopus has that ability. One library I was looking into for handwriting recognition was:
Online Video
Java Neural Networks
Conceivably you could take the PDF, convert it to a TIFF if the software requires it, and it would give you something.
Good luck!
If you have the notebook as a PDF file, you could e-mail it to a Gmail account; Gmail then lets you "view" the PDF in your browser as an HTML file. The pages still remain images, though.
If you want the text out of it, OCR might work, but it may also be unable to extract the handwriting.
I have these GUI screenshots from the design team which I need to convert to a web page and whatnot. I'm thinking of finding some website which resembles the GUI so that I can copy and paste the HTML and don't have to start from scratch. The only drawback of this method is that I don't know which website actually looks like that, so it might mean a lot of browsing time. Hehe.
So I'm just wondering if there's a tool which can help me do the search? Or, better yet, a tool which can convert an image into the equivalent HTML web page?
I guess I'm just another lazy, uncreative programmer trying to get the GUI part done quick and dirty, hehe.
thanks.
You mean you have a PNG, GIF or JPG screenshot... and you want to feed that into a program and have it spit out a collection of HTML and CSS which, when viewed in a browser, would look just like that image?
I'm sorry to burst your bubble, but I would be very, very, very surprised if this was the case.
It's basically just impossible. If you see a box on the screen, it could be a text area, or a div, or a td, or a gif, or any one of fifteen different things. There's no way at all a program could ever figure out which HTML element to use.
I'm sorry, but you're going to have to write HTML yourself. A tool like Dreamweaver will help speed the process if you're new to HTML. But I'll bet ya two bits to a farthing that there's nothing on the planet which will automate this job.
Not the answer you wanted, sorry. But it's the answer.
I am pretty sure that you can use Adobe Dreamweaver to do this - going from design to HTML.
You mention that they used Fireworks to do the design. Is that Adobe Fireworks? If so, that application has the option of outputting the design into HTML for you.
From the feature list:
Design once, deploy to many platforms
Output Fireworks designs to HTML or the application of your choice: Adobe Flash, Adobe AIR™, or Adobe Flex®. Craft custom skins with exceptional design tools. Now your tools will play well together. Design within Fireworks and then export standards-compliant CSS-based layouts — complete with external style sheets — to Dreamweaver CS4. Create components in Fireworks for use in Adobe Flex Builder™ software. Create HTML-based Adobe AIR prototypes directly from Fireworks.
I'm not aware of any tool that can convert an image of a GUI into the equivalent HTML. I would imagine that would be very dependent on the styling of the GUI anyway. On top of that, it would probably produce some pretty nasty and nightmarish-to-maintain HTML.
It's probably not the answer you want to hear, but I'd start coding.
Did the design team do the design in HTML? Photoshop mockups? Both of those would give you a better shot at avoiding hand-coding time.
From a screenshot?
You probably don't want to do this. At minimum, you want the file saved as JPEG or PNG, or any format, preferably with no compression so you don't lose quality.
Then you can slice it. Google "photoshop slicing tutorial" and tons will appear.
At best, you want the PSD file, which you can then slice up, hiding or showing layers to leave out the things you don't want, etc.
One challenging topic in computer vision is processing document scans. Typically this involves a number of steps, like noise removal, color analysis, binarization, text block identification, OCR, and then maybe some context analysis and correction.
I'm curious if anyone understands, knows or can point me to literature on how Google identifies text blocks prior to the OCR stage. Any insights?
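To make the binarization and text block identification steps concrete, here is a toy sketch of one classical textbook approach: a fixed threshold followed by a horizontal projection profile that groups rows containing ink into line bands. This is only an illustration of the general idea, with made-up function names and a tiny hand-built image; it says nothing about what Google actually does.

```python
def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 1 for ink, 0 for background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def text_line_bands(binary, min_ink=1):
    """Return (start_row, end_row) pairs, end exclusive, for runs of rows with ink."""
    bands = []
    start = None
    for y, row in enumerate(binary):
        has_ink = sum(row) >= min_ink  # projection of this row onto the y axis
        if has_ink and start is None:
            start = y                  # a new band begins
        elif not has_ink and start is not None:
            bands.append((start, y))   # a blank row closes the band
            start = None
    if start is not None:
        bands.append((start, len(binary)))
    return bands


if __name__ == "__main__":
    W, B = 255, 0  # white background, black ink
    page = [
        [W, W, W, W],
        [W, B, B, W],
        [W, B, W, W],
        [W, W, W, W],
        [B, B, B, W],
    ]
    print(text_line_bands(binarize(page)))  # prints [(1, 3), (4, 5)]
```

Real scans usually need an adaptive threshold and deskewing before a projection profile gives usable bands.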
I believe Google uses the Tesseract OCR engine in conjunction with another tool called OCRopus, both of which are open-source. I don't know anything about how they work, but you may be interested in checking out the code, available at the above links.
This is second-hand information from the digitization specialist in my library, but it seems that Google's approach is to just throw everything through the automated process, OCR anything that looks like text, and not fuss too much about cropping individual images or doing much semantic analysis to look for image captions, etc. They may be doing subtle things that aren't obvious, but on the surface they are definitely gunning for quantity over quality, which is smart for their purposes, IMO.