I have used the Google Cloud Vision API for document text detection, but I could not figure out whether it lets you define a particular area of the image from which to extract text.
For example, if my image has 3 columns of text, I want to provide the top-left coordinates, width, and height of a particular column on which to perform OCR. Is that possible?
Also, is there any other way to avoid getting jumbled-up text when the image has 3 columns of text?
Currently, it is not possible to define a particular area of the image from which to extract text. There is no parameter for that in the image context of either the REST or gRPC API. A possible workaround is to crop your image and send only the region you want to transcribe. If you want to automate this process, the object localization or crop hints features may be of use.
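If it helps, here is a rough sketch of that cropping workaround, assuming a Node.js setup with the sharp and @google-cloud/vision packages; the file path and column coordinates are placeholders:

```typescript
// Sketch: crop one column out of the page image with sharp, then OCR only that crop.
// Assumes Node.js with the `sharp` and `@google-cloud/vision` packages installed and
// application-default credentials configured; paths and coordinates are placeholders.
import sharp from 'sharp';
import vision from '@google-cloud/vision';

async function ocrColumn(
  imagePath: string,
  left: number,
  top: number,
  width: number,
  height: number
): Promise<string> {
  // Crop the region of interest to an in-memory buffer.
  const columnBuffer = await sharp(imagePath)
    .extract({ left, top, width, height })
    .toBuffer();

  // Send only the cropped column to document text detection.
  const client = new vision.ImageAnnotatorClient();
  const [result] = await client.documentTextDetection({
    image: { content: columnBuffer },
  });
  return result.fullTextAnnotation?.text ?? '';
}

// Example: OCR the middle column of a 3-column scan (coordinates are made up).
ocrColumn('page.png', 800, 0, 800, 2200).then(console.log);
```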
Regarding the jumbled-up text, you may be able to locate each block or paragraph in the JSON response.
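For instance, here is a sketch of walking the fullTextAnnotation hierarchy (pages > blocks > paragraphs > words > symbols) so text stays grouped by block rather than being read straight across the columns; it assumes the same Node.js Vision client as above:

```typescript
// Sketch: collect the text block by block, keeping each block's bounding box
// available so you can decide which column it belongs to. Error handling omitted.
import vision from '@google-cloud/vision';

async function textByBlock(imagePath: string): Promise<string[]> {
  const client = new vision.ImageAnnotatorClient();
  const [result] = await client.documentTextDetection(imagePath);
  const blocks: string[] = [];

  for (const page of result.fullTextAnnotation?.pages ?? []) {
    for (const block of page.blocks ?? []) {
      const blockText = (block.paragraphs ?? [])
        .map((paragraph) =>
          (paragraph.words ?? [])
            .map((word) => (word.symbols ?? []).map((s) => s.text).join(''))
            .join(' ')
        )
        .join('\n');
      blocks.push(blockText);
      // block.boundingBox?.vertices holds the block's corner coordinates,
      // which you can use to assign the block to a column.
    }
  }
  return blocks;
}
```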
You can build your own wrapper class around the Detector class, then rebuild the bitmap in the Frame object that gets fed into the detect method.
I have a DWG file (I can change the format to SVG or another format if needed) that I want to show on my web page. Once the file is displayed on the page, I want to be able to zoom in, zoom out, pan, and add links that call an API where necessary. In essence, it should work like the landmarks on Google Maps, where information and links about the relevant place appear when you hover over them.
How should I go about this?
If you can translate the DWG file to DXF (there are several tools to do this), then you can use MapServer to render it in a web-map-compatible way. Have a look at MS4W for an easy way to install and configure MapServer on Windows. Since you want pop-ups, I'd recommend using Leaflet as the client-side browser toolkit for providing pan, zoom, pop-ups, etc. and communicating with MapServer. Figuring out the coordinate system of the DWG file will likely be the hardest part.
Use this method if integrating your DWG with other mapping data (roads, etc) is important. Otherwise look for something easier to implement.
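For orientation, here is a minimal sketch of the Leaflet side talking to a MapServer WMS endpoint; the WMS URL, layer name and coordinates are placeholders, and real pop-up content would come from a GetFeatureInfo request or your own API:

```typescript
// Sketch: Leaflet map with the converted DXF served by MapServer as a WMS layer,
// plus a pop-up on click. URL, layer name and coordinates are placeholders; as
// noted above, matching the DWG/DXF coordinate system is the hard part.
import L from 'leaflet';

const map = L.map('map').setView([48.2, 16.37], 15);

L.tileLayer.wms('https://example.com/cgi-bin/mapserv?map=/maps/drawing.map', {
  layers: 'drawing',
  format: 'image/png',
  transparent: true,
}).addTo(map);

// Simple pop-up wherever the user clicks; in practice you would query the
// server for details about the clicked feature and fill the content from that.
map.on('click', (e: L.LeafletMouseEvent) => {
  L.popup().setLatLng(e.latlng).setContent('Feature details go here').openOn(map);
});
```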
There are libraries that let you easily zoom and pan an SVG image. For example svg-pan-zoom.
As for the links, you would need to do a bit of extra work. IIRC, DXF files don't have the concept of a whole element that you could hover over; all the lines in the file are discrete objects. So, if I am remembering that correctly, you may need to load the SVG into an editor and add elements on top of the diagram that correspond to your hover areas. They don't need to be visible; they can be transparent and still hoverable. You just need to then add the interactivity (a sketch follows the list below), i.e.:
Optionally add hover effects with CSS.
Add mouseover or click event handling to implement the links, or
use standard SVG <a> linking instead.
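As a rough illustration of both points (pan/zoom via svg-pan-zoom plus a transparent, clickable hotspot), here is a sketch; the element id, coordinates and target URL are assumptions:

```typescript
// Sketch: enable pan/zoom on an inline SVG with svg-pan-zoom and add an
// invisible, hoverable hotspot that acts as a link. Assumes the svg-pan-zoom
// package and an inline <svg id="drawing"> already on the page; the id,
// coordinates and target URL are placeholders.
import svgPanZoom from 'svg-pan-zoom';

const SVG_NS = 'http://www.w3.org/2000/svg';
const svg = document.getElementById('drawing') as unknown as SVGSVGElement;

// Transparent rectangle over the area of interest: invisible, but it still
// receives pointer events, so it can be hovered and clicked.
const hotspot = document.createElementNS(SVG_NS, 'rect');
hotspot.setAttribute('x', '120');
hotspot.setAttribute('y', '80');
hotspot.setAttribute('width', '200');
hotspot.setAttribute('height', '150');
hotspot.setAttribute('fill', 'transparent');
hotspot.setAttribute('pointer-events', 'all');
hotspot.addEventListener('mouseenter', () => hotspot.setAttribute('stroke', 'orange'));
hotspot.addEventListener('mouseleave', () => hotspot.removeAttribute('stroke'));
hotspot.addEventListener('click', () => {
  // Navigate or call your API; the URL is a placeholder.
  window.open('https://example.com/details/area-1', '_blank');
});

// Add the hotspot before initializing pan/zoom so it is transformed together
// with the rest of the drawing.
svg.appendChild(hotspot);
svgPanZoom(svg, { zoomEnabled: true, controlIconsEnabled: true, fit: true, center: true });
```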
I would like to create a webpage that displays one image at a time with shapes and annotations shown over it. Each image should have a unique URL in the address bar.
I have a set of images as PNGs. For each image there is a set of shapes (rectangles and polylines, as pairs of x,y pixel coordinates) and annotations (an x,y pixel location on the image plus a short text string) to be displayed over it.
When a user loads the unique URL for an image, they should see it with the shapes displayed on top and the annotation markers shown as circles. When the user presses a button labeled "next", the page loads the next PNG with its corresponding shapes and annotations. The user can click on an annotation marker, and a text balloon should open showing the text for that annotation.
How should I approach developing this? I'm asking because I don't have an overview of web app frameworks or of current best practices for databases and graphics formats for online content.
I have programming experience - Python, data science, procedural geometry, Unity game development (C#), Lua - but not for the web.
I could build a WebGL app in Unity to do what I want and link it to a MySQL database, but that feels like shooting a fly with a bazooka. Maybe there is an easier, simpler way. Any advice or tips would be appreciated.
The best tip, I think, is in one of the tags of your question: canvas.
Images can be drawn onto it, as well as shapes. To handle your markers you could compare the coordinates of a click on the canvas to the coordinates of your markers (see this post for further info on getting the click coordinates). The annotation balloons could be realised by hiding or showing a simple <div> on click.
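As a rough sketch of that approach (the canvas id, the data shapes and the alert() stand-in for the balloon <div> are assumptions):

```typescript
// Sketch: draw the PNG, the rectangles and the annotation markers onto a
// <canvas id="viewer">, then hit-test clicks against the marker coordinates.
interface Annotation { x: number; y: number; text: string; }

const canvas = document.getElementById('viewer') as HTMLCanvasElement;
const ctx = canvas.getContext('2d')!;
const MARKER_RADIUS = 8;
let currentAnnotations: Annotation[] = [];

function render(imageUrl: string, rects: number[][], annotations: Annotation[]): void {
  currentAnnotations = annotations;
  const img = new Image();
  img.onload = () => {
    ctx.clearRect(0, 0, canvas.width, canvas.height);
    ctx.drawImage(img, 0, 0);

    // Rectangles given as [x, y, width, height] in pixels.
    ctx.strokeStyle = 'red';
    for (const [x, y, w, h] of rects) ctx.strokeRect(x, y, w, h);

    // Annotation markers drawn as filled circles.
    ctx.fillStyle = 'dodgerblue';
    for (const a of annotations) {
      ctx.beginPath();
      ctx.arc(a.x, a.y, MARKER_RADIUS, 0, Math.PI * 2);
      ctx.fill();
    }
  };
  img.src = imageUrl;
}

// Hit-test clicks against the marker positions and show the matching text.
// Assumes the canvas is not scaled by CSS, so click and pixel coordinates match.
canvas.addEventListener('click', (event) => {
  const bounds = canvas.getBoundingClientRect();
  const x = event.clientX - bounds.left;
  const y = event.clientY - bounds.top;
  const hit = currentAnnotations.find((a) => Math.hypot(a.x - x, a.y - y) <= MARKER_RADIUS);
  // In a real page you would show/hide a positioned <div> balloon here.
  if (hit) alert(hit.text);
});

// Example usage with made-up data; in practice this would come from your backend.
render('images/slide-001.png', [[40, 60, 200, 120]], [{ x: 140, y: 120, text: 'Main entrance' }]);
```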
As for web app frameworks, depending on how fancy it should be, you could do it by hand in plain HTML and CSS, or use something with nice predefined components, like Bootstrap.
Depending on where you will be hosting this webpage, you may or may not have to worry about the backend/server. There are numerous hosting providers available that offer user-friendly admin panels.
I have the following document image.
When I try to convert the image to text, the result is this:
Top Text
Ref: Rad: Dte: Ddo:
Ejecutivo 76520400300 Banco de Bogotá Luz Adriana
Botton Text
The problem is that the Google API recognizes it as two columns. How can I configure the Google API to obtain the text as a single column?
My goal is to obtain:
Top Text
Ref:Ejecutivo Rad: 76520400300 Dte: Banco de Bogotá Ddo:Luz Adriana
Botton Text
A Google team member responded that Document AI works better than Cloud Vision for this, as per the update on the issue.
The Cloud Vision API doesn't have a specific request property for specifying how the file's text is read or sorted. Instead, I think the available workaround is to use the BoundingPoly and Vertex response properties, which give the coordinates of each word in the image, and process those vertices in your own code to decide which text should be grouped into columns and rows. You can take a look at this link, which includes some response examples that contain these properties.
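As an illustration of that workaround, here is a sketch that sorts the word-level annotations by their bounding-box vertices and regroups them into rows, so both columns end up on the same line; the Node.js client and the row-tolerance value are assumptions:

```typescript
// Sketch: rebuild reading lines across columns using boundingPoly vertices.
// textAnnotations[0] is the whole text; the remaining entries are single words.
// The row tolerance (in pixels) is an assumption you would tune for your scans.
import vision from '@google-cloud/vision';

async function wordsAsRows(imagePath: string, rowTolerance = 15): Promise<string[]> {
  const client = new vision.ImageAnnotatorClient();
  const [result] = await client.documentTextDetection(imagePath);

  const words = (result.textAnnotations ?? []).slice(1).map((w) => {
    const topLeft = w.boundingPoly?.vertices?.[0]; // top-left vertex of the word box
    return { text: w.description ?? '', x: topLeft?.x ?? 0, y: topLeft?.y ?? 0 };
  });

  // Group words whose top edges are within rowTolerance of each other,
  // then order each group left to right so both columns land on one row.
  words.sort((a, b) => a.y - b.y || a.x - b.x);
  const rows: { y: number; items: typeof words }[] = [];
  for (const word of words) {
    const row = rows.find((r) => Math.abs(r.y - word.y) <= rowTolerance);
    if (row) row.items.push(word);
    else rows.push({ y: word.y, items: [word] });
  }
  return rows.map((r) => r.items.sort((a, b) => a.x - b.x).map((w) => w.text).join(' '));
}
```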
If this workaround doesn't cover your current needs, you can use the Send Feedback button, located at the lower left and upper right corners of the service's public documentation, or take a look at the Issue Tracker tool to raise a Vision API feature request and notify Google about the desired functionality.
Is there a way to insert an image overlay one layer below the streets but on top of the map background? The roads can be individually styled, so it should technically be possible, but I haven't been able to find an option for it.
The only lead I have found so far is this question: Google Maps API - Overlay Custom Roads, which unfortunately doesn't really solve the problem of having to enter the street info manually.
I'm currently working on a custom map for a whole city, and manually illustrating all the streets and entering the street names would take an enormous amount of time.
Any info would be very appreciated, thanks!
Check this documentation about Styled Maps. Styled maps allow you to customize the presentation of the standard Google base maps, changing the visual display of elements such as roads, parks, and built-up areas.
Here you can also find some sample code that you can use in your own code.
You can also find the Styled Map Wizard here.
Creating styles by hand and testing your code to see how they look is potentially time-consuming. Instead, you can use the Styled Map Wizard to set up the JSON for your map's styles. The wizard allows you to select features and their elements, apply operations to those features, and save the styles to JSON, which you can copy and paste into your application.
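To give a feel for what such a style looks like in code, here is a minimal sketch that restyles road geometry and labels on the base map; the element id, center and colors are placeholders, and whether this gets you a true "overlay below the streets" depends on how you combine it with your own imagery:

```typescript
// Sketch: a styled base map that tones down the default road styling so a
// custom overlay stands out. Assumes the Maps JavaScript API script tag is
// already loaded on the page; the map element, center and styles are placeholders.
declare const google: any;

const map = new google.maps.Map(document.getElementById('map') as HTMLElement, {
  center: { lat: 52.52, lng: 13.405 },
  zoom: 13,
  styles: [
    // Hide road labels entirely.
    { featureType: 'road', elementType: 'labels', stylers: [{ visibility: 'off' }] },
    // Recolor road geometry so it blends with a custom background.
    { featureType: 'road', elementType: 'geometry', stylers: [{ color: '#d8d8d8' }] },
    // Soften the landscape behind the roads.
    { featureType: 'landscape', elementType: 'geometry', stylers: [{ color: '#f2efe9' }] },
  ],
});
```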
I'm trying to integrate my own markers as pointers on my map. The defaults of circle, rectangle, diamond, etc. are not what I need, and I'm looking for arrow symbols instead, ideally the popular Microsoft Wingdings arrows. I'm surprised simple arrows are not on the default list; I'd have thought there would be many a need to indicate a rise or fall in numeric data on a map.
I would like to solve this with an expression that forces an arrow icon as a marker. Can this be done using its character code? I'm using SSDT to design the report.
Alternatively, I'll just have to draw the arrow in Paint and upload it via the image import.
Food For Thought
I see they've done a great job of making the map process easy to set up, but when it comes to customisation beyond the norm it is extremely difficult.
TechNet: Understanding Marker type Rules:
http://technet.microsoft.com/en-us/library/ee240825.aspx
As you can see from the link (just one example, so as not to swarm this post with links), Microsoft makes no mention of image upload or expression input for maps. The maps are great, but I find it difficult to get documented resources for customising my report further.
You'll need to use an image of an arrow for your custom marker, and you will still be able to change other attributes of it (size, transparency, etc).
If you use a custom image marker, you may run into problems where Visual Studio fails to render the map in design mode from time to time - it's incredibly annoying, so I find it best to drop in the custom images as the very last thing I do when building a map (just use a circle marker or something in the interim).