Sample Labeling Tool OCR Text Detection Problems

I have a question regarding Azure Form Recognizer's OCR with handwritten text.
When running OCR on handwritten PDF files before labeling in Azure's Sample Labeling Tool, the OCR often detects text incorrectly. With other form analysis and extraction technologies, an option is often provided to enter the text that was supposed to be detected to essentially "correct" the OCR. For training Azure Form Recognizer in the Sample Labeling Tool (Docker image), I do not see a way for me to override the OCR text and enter the correct text.
Is there a way I can enter the text myself that the OCR is failing to detect or detecting incorrectly?
For example, the image below is what the OCR in Azure's Sample Labeling Tool picked up:
[OCR detection sample image]
Is there a way to correct this result and tell Form Recognizer that the text should be: "Bridget Sims, MD"?

Currently there is no way to correct the OCR result and improve its accuracy right away. The typical scenario is to train a Form Recognizer model from a small set of training files and then use it to process more documents. During training, a small number of OCR errors is not critical to model quality, so you can ignore them. The product team is working on a new version of OCR with better handwriting recognition accuracy.
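For reference, a minimal sketch of that train-then-process workflow, assuming the Python azure-ai-formrecognizer (v3) package; the endpoint, key, and container SAS URL are placeholders:

```python
# Minimal sketch: train a custom model from a small labeled set produced
# by the Sample Labeling Tool, then use it on new documents.
# Endpoint, key, and SAS URL are placeholders.
from azure.core.credentials import AzureKeyCredential
from azure.ai.formrecognizer import FormTrainingClient

endpoint = "https://<your-resource>.cognitiveservices.azure.com/"  # placeholder
key = "<your-key>"                                                 # placeholder

training_client = FormTrainingClient(endpoint, AzureKeyCredential(key))

# The container should hold the training PDFs plus the .labels.json and
# .ocr.json files written by the Sample Labeling Tool.
poller = training_client.begin_training(
    training_files_url="<SAS-URL-of-training-container>",  # placeholder
    use_training_labels=True,
)
model = poller.result()
print(model.model_id, model.status)
```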
thanks
-xin
[Microsoft Azure Form Recognizer Team]

Related

What is needed for a Design Automation project? My project is very slow

After displaying the house in the Forge Viewer from a Revit file, the user wants to modify the contents in the viewer and receive the result as a Revit file again. What function should I use to implement this?
https://learnforge.autodesk.io/#/
I referred to this.
My project: I create input data, send it to Design Automation, get an output file, and then translate that file for the viewer (this takes about 2 minutes).
This process is very slow, and I want to make it faster.
https://www.autodesk.com/autodesk-university/class/Its-Not-Too-Late-Automate-Using-Forge-Design-Automation-Inventor-2021#video
The workflow in this video looks very fast. How can I get a fast workflow like in this video?
Thank you
The Forge Viewer is a viewer, not an editor.
The Forge viewer displays the translated version of a seed CAD model, in this case, a Revit RVT BIM.
You cannot edit or modify the CAD seed file in the viewer.
To achieve such a modification, you have to either use the original modelling software, in this case, Revit on the Windows desktop, or you can use the Forge Design Automation API for Revit.
That is what was used to create the Inventor sample you refer to.
Oops... re-reading your question, I see that you are already using design automation yourself as well. Congratulations on that.
However, there is no guarantee on the turn-around time for this process. The video may very well have been edited to eliminate a waiting period, or the user creating the video may just have been very lucky and achieved a faster turn-around time.
I checked with the Forge team for you whether they used any additional tricks to speed things up in the video, or whether it truly shows real-time processing. They confirm:
For Revit, a basic Design Automation job should take up to 30-40 seconds for the processing alone. The derivative job for translation to the viewer format can take another minute. So, 2 minutes is expected. The sketch-it demo video has a timer on the side to indicate real time.
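If you want to measure the translation half of that yourself, here is a rough sketch that polls the Model Derivative manifest until the viewer format is ready; the access token and base64 URN are placeholders:

```python
# Rough sketch: poll the Model Derivative manifest until the viewer
# translation finishes, to see where the time goes. Token and URN
# are placeholders.
import time
import requests

TOKEN = "<forge-access-token>"  # placeholder
URN = "<base64-encoded-urn>"    # placeholder
URL = f"https://developer.api.autodesk.com/modelderivative/v2/designdata/{URN}/manifest"

while True:
    resp = requests.get(URL, headers={"Authorization": f"Bearer {TOKEN}"})
    resp.raise_for_status()
    manifest = resp.json()
    # status moves from pending/inprogress to success (or failed/timeout)
    print(manifest["status"], manifest.get("progress"))
    if manifest["status"] in ("success", "failed", "timeout"):
        break
    time.sleep(5)  # the derivative job alone can take a minute or more
```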

Return an OCR'd PDF File (with text overlay) from Azure Cognitive Services Read

I have implemented Azure Cognitive Read service to return extracted/OCR text from a PDF.
However, to make it easier for the user to understand the context and to copy and paste data from the PDF, I would like to overlay that text data on the PDF. I would then drop that PDF into a viewer.
Does anyone have any ideas on how to proceed? I would also be happy to use AWS. Basically, what I am after is an API that I submit a PDF to and that returns an OCR'd PDF. If this is not possible, a library to which I can submit the text and the PDF (and get back a text-searchable PDF) would also be ideal.
I am looking for something similar and stumbled upon this:
https://learn.microsoft.com/en-us/azure/cognitive-services/form-recognizer/overview?tabs=v2-1
This is Azure Form Recognizer.
What is Azure Form Recognizer?
Azure Form Recognizer is a part of Azure Applied AI Services that lets you build automated data processing software using machine learning technology. Identify and extract text, key/value pairs, selection marks, tables, and structure from your documents; the service outputs structured data that includes the relationships in the original file, bounding boxes, confidence, and more. You quickly get accurate results that are tailored to your specific content without heavy manual intervention or extensive data science expertise. Use Form Recognizer to automate data entry in your applications and enrich your documents' search capabilities.
They have an online example test:
https://fott-2-1.azurewebsites.net/prebuilts-analyze
Create a service in Azure for free and test whether it fits your needs. You will get a JSON reply, and you can use the boundingBox values for display. I haven't gone as far as applying the bounding boxes to the PDF myself.
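If you do want to try that last step, one sketch (untested against the real response shape) is to write each recognized word invisibly at its bounding-box position with PyMuPDF, so the text layer is searchable and copyable without changing the page's appearance:

```python
# Sketch: overlay OCR words as an invisible text layer using boundingBox
# values from the JSON reply. Assumes PyMuPDF (fitz); the `words` list is
# hypothetical, pre-parsed from the response and converted to PDF points.
import fitz  # PyMuPDF

words = [(0, 72, 100, "Bridget"), (0, 120, 100, "Sims, MD")]  # hypothetical

doc = fitz.open("input.pdf")
for page_no, x, y, text in words:
    page = doc[page_no]
    # render_mode=3 draws the text invisibly: it is searchable and
    # selectable but does not alter the page's visual appearance.
    page.insert_text((x, y), text, fontsize=10, render_mode=3)
doc.save("searchable.pdf")
```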
There is also a NuGet package, free for development, that does exactly what you want:
https://ironsoftware.com/csharp/ocr/#sample-tesseract-create-searchable-pdf
The OCR library is free for development, so you can test whether it works for you.
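If you are not tied to .NET, a comparable sketch in Python with pytesseract (a different library from the IronOCR one linked above) produces a searchable PDF directly, since Tesseract can emit a PDF with an invisible text layer:

```python
# Sketch of the same idea with pytesseract instead of IronOCR: Tesseract
# emits a PDF with an invisible text layer over the page image. Assumes
# Tesseract is installed and "page.png" is a scanned page.
import pytesseract

pdf_bytes = pytesseract.image_to_pdf_or_hocr("page.png", extension="pdf")
with open("searchable.pdf", "wb") as f:
    f.write(pdf_bytes)
```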

Azure Detection with Custom Vision and OCR text detection

I'm studying the Azure Custom Vision service for object detection, but I would also like to extract text information within a tagged image zone.
Is it possible with Custom Vision?
If not, is it in the service roadmap?
Thank you
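For what it's worth, Custom Vision itself only returns tags and bounding boxes, not text, so one way to approximate this pipeline is to crop the detected zone and run OCR on the crop separately. A sketch of that idea, assuming the azure-cognitiveservices-vision-customvision package with placeholder endpoint, key, and project values:

```python
# Sketch: Custom Vision finds the tagged zone, then the crop is handed to
# a separate OCR pass (Custom Vision does not return text itself).
# Endpoint, key, project ID, and published name are placeholders.
from azure.cognitiveservices.vision.customvision.prediction import CustomVisionPredictionClient
from msrest.authentication import ApiKeyCredentials
from PIL import Image

credentials = ApiKeyCredentials(in_headers={"Prediction-key": "<key>"})  # placeholder
predictor = CustomVisionPredictionClient("<endpoint>", credentials)      # placeholder

with open("photo.jpg", "rb") as f:
    results = predictor.detect_image("<project-id>", "<published-name>", f.read())

image = Image.open("photo.jpg")
for pred in results.predictions:
    if pred.probability < 0.8:
        continue
    # Bounding boxes are normalized [0, 1]; convert to pixels and crop.
    bb = pred.bounding_box
    left, top = bb.left * image.width, bb.top * image.height
    crop = image.crop((left, top,
                       left + bb.width * image.width,
                       top + bb.height * image.height))
    crop.save("zone.png")  # then send this crop to an OCR service
```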

ABBYY Cloud OCR SDK

I am working on a project for school and I need an OCR. I downloaded the free trial of the ABBYY Cloud OCR SDK, but after reading all the documentation and the API reference I still don't understand how to use the cloud service. Has anyone used this tool who could explain to me how it works, or share some demo code showing how to use it?
thanks!
Actually, dragging and dropping files onto the GUI should work.
Make sure that you have set the right recognition language. The default one is English.
If some short text fragments are not recognized and you don’t need to save any pictures and tables, you can try to use the textExtraction profile. The request URL should be like http://cloud.ocrsdk.com/processImage?profile=textExtraction&exportFormat=pdfSearchable.
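For example, a rough sketch of that request and the follow-up polling in Python, assuming your Application ID and password as the HTTP Basic credentials (which is how the Cloud OCR SDK authenticates):

```python
# Rough sketch: call processImage, poll getTaskStatus until the task
# completes, then download the searchable PDF. Application ID and
# password are placeholders; the service returns task status as XML.
import time
import xml.etree.ElementTree as ET
import requests

AUTH = ("<application-id>", "<password>")  # placeholder credentials
BASE = "http://cloud.ocrsdk.com"

with open("scan.jpg", "rb") as f:
    resp = requests.post(
        f"{BASE}/processImage",
        params={"profile": "textExtraction", "exportFormat": "pdfSearchable"},
        data=f.read(),
        auth=AUTH,
    )
task = ET.fromstring(resp.content).find("task")
task_id = task.get("id")

# Poll until the task finishes, then fetch the result.
while task.get("status") not in ("Completed", "ProcessingFailed"):
    time.sleep(3)
    resp = requests.get(f"{BASE}/getTaskStatus",
                        params={"taskId": task_id}, auth=AUTH)
    task = ET.fromstring(resp.content).find("task")

if task.get("status") == "Completed":
    pdf = requests.get(task.get("resultUrl")).content
    open("result.pdf", "wb").write(pdf)
```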
If the document quality is poor, you can try to use field-level recognition (when text coordinates are specified directly). It could be tested using the same GUI sample (second tab).
Note that the recommended resolution for a 12-16 pt font size is about 300 dpi (see the source image recommendations for more details).
To get more detailed recommendations, you can send your images and Application ID to CloudOCRSDK#abbyy.com.

Google Maps MapsGL vector format or other map tile vector formats

I was wondering if anyone knew anything about the new Google Maps MapsGL format for their vector data. I have worked a little with OpenStreetMap data, rendering it to raster tiles with Mapnik. I noticed Mapnik can also render to an SVG file as vector data, but the uncompressed SVG files are bigger than the raster images. After seeing the new MapsGL thing from Google, I was wondering what they, or anyone else, did for vector data that is chunked up into tiles. I would like to know of any other data formats that might be used for storing OpenStreetMap data as vector data that can be rendered quickly. Seeing how Google Maps MapsGL works in a web app, I would be interested in any details of how they did it.
My current focus is rendering the data in a desktop program using OpenGL, but it would be ideal if the formats could also work in web or mobile apps.
Don't mix up geographic vector data formats with SVG. SVG is intended purely for graphic rendering, and its semantics don't know anything about the source geographic data. So SVG is definitely not a good format to keep your geo data in (and it's too verbose anyway).
What you would need is some kind of binary format (better suited for desktop apps) or very terse JSON (better for Web clients) to store OSM data in.
I suggest reading this QA: https://gis.stackexchange.com/questions/15240/how-to-create-vector-polygons-at-the-same-amazing-speeds-giscloud-is-able-to-ren
There are also some attempts to formulate a binary OSM protocol, but I don't know what state these projects are in:
http://wiki.openstreetmap.org/wiki/OSM_Binary_Format
http://wiki.openstreetmap.org/wiki/OSM_Mobile_Binary_Protocol
Mapbox made "Vector tiles" and that is the answer I was looking for.
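For anyone who lands here later, a small sketch of decoding one such tile, assuming the mapbox-vector-tile Python package and a tile already downloaded to disk:

```python
# Small sketch: decode a Mapbox vector tile (protobuf-encoded) into
# layers of features with geometry and attributes. Assumes the
# mapbox-vector-tile package and a tile saved as "tile.mvt".
import mapbox_vector_tile

with open("tile.mvt", "rb") as f:
    data = f.read()

tile = mapbox_vector_tile.decode(data)
for layer_name, layer in tile.items():
    print(layer_name, len(layer["features"]))
    for feature in layer["features"][:3]:
        # Geometry is GeoJSON-like; coordinates are in tile-local units.
        print(feature["geometry"]["type"], feature["properties"])
```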