How to group blocks that are part of a bigger sentence in Google Cloud Vision API? - ocr

I am using the Google Cloud Vision API in Python to detect text values on hoarding boards, the kind usually found above a shop/store. So far I have been able to detect individual words and the coordinates of their bounding polygons. Is there a way to group the detected words based on their relative positions and sizes?
For example, the name of the store is usually written in the same size and the words are aligned. Does the API provide functions that group words which are probably parts of a bigger sentence (the store name, the address, etc.)?
If the API does not provide such functions, what would be a good approach to grouping them? Following is an example of what I have done so far:
Vision API output excerpt:
description: "SHOP"
bounding_poly {
vertices {
x: 4713
y: 737
}
vertices {
x: 5538
y: 737
}
vertices {
x: 5538
y: 1086
}
vertices {
x: 4713
y: 1086
}
}
, description: "OVOns"
bounding_poly {
vertices {
x: 6662
y: 1385
}
vertices {
x: 6745
y: 1385
}
vertices {
x: 6745
y: 1402
}
vertices {
x: 6662
y: 1402
}
}

I suggest you take a look at the TextAnnotation response format that applies when you use DOCUMENT_TEXT_DETECTION in an OCR request. This response contains detailed information about the image metadata and the text content values, which can be used to group the text by block, paragraph, word, etc., as described in the public documentation:
TextAnnotation contains a structured representation of OCR extracted text. The hierarchy of an OCR extracted text structure is like this: TextAnnotation -> Page -> Block -> Paragraph -> Word -> Symbol
Additionally, you can follow this useful example, which shows how to organize the text extracted from a receipt image by processing the fullTextAnnotation response content.
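For instance, here is a minimal sketch with the Python client (the file name is hypothetical, and depending on your client library version you may need vision.types.Image instead of vision.Image):

from google.cloud import vision

client = vision.ImageAnnotatorClient()
with open('storefront.jpg', 'rb') as f:  # hypothetical input image
    image = vision.Image(content=f.read())

# DOCUMENT_TEXT_DETECTION populates full_text_annotation.
response = client.document_text_detection(image=image)

# Walk the hierarchy: Page -> Block -> Paragraph -> Word -> Symbol.
for page in response.full_text_annotation.pages:
    for block in page.blocks:
        words = []
        for paragraph in block.paragraphs:
            for word in paragraph.words:
                words.append(''.join(s.text for s in word.symbols))
        print('Block:', ' '.join(words))
        print('Bounding box:', block.bounding_box)

Each block's bounding_box covers the whole group of words, which should give you the store-name/address level of grouping you are looking for.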

Related

Segmenting on Arcs from DWG File

I have an application using the Forge Viewer to display converted ACAD DWG files. The short description is that I need to take specific polylines out of the DWG file source and use the Edit2D extension to draw them as polygons over the background. I have this working, but arcs are causing some issues right now. This doesn't have to be perfect, but it should be reasonably close to the same shape. In most cases it just draws a line from the start to the end of the arc (and I understand why; see the code below), but in other cases it significantly segments the arc, and I'm not sure why.
I start by finding the IDs of the polylines based on their layer and then getting the fragment IDs (this is working fine). Then I get the vertices of the polyline like this:
export function getVertexesById(
  viewer: Autodesk.Viewing.GuiViewer3D,
  frags: Autodesk.Viewing.Private.FragmentList,
  fragIds: number[],
  dbId: number
): Point[] {
  // We need to also get the center points of arcs as lines seem to be drawn
  // to them in the callbacks for some reason. Center points should later be
  // removed from the point array, so we don't get strange spikes on our shapes.
  const polyPoints: Point[] = [];
  const centers: Point[] = [];
  fragIds.forEach((fid) => {
    const mesh = frags.getVizmesh(fid);
    const vbr = new Autodesk.Viewing.Private.VertexBufferReader(
      mesh.geometry,
      viewer.impl.use2dInstancing
    );
    vbr.enumGeomsForObject(dbId, {
      onLineSegment(
        x1: number,
        y1: number,
        x2: number,
        y2: number,
        _vpId: number
      ) {
        checkAddPoint(polyPoints, { x: x1, y: y1, z: 0 });
        checkAddPoint(polyPoints, { x: x2, y: y2, z: 0 });
      },
      onCircularArc: function (cx, cy, start, end, radius, _vpId) {
        centers.push({ x: cx, y: cy, z: 0 });
      },
      onEllipticalArc: function (
        cx,
        cy,
        start,
        end,
        major,
        minor,
        tilt,
        _vpId
      ) {
        centers.push({ x: cx, y: cy, z: 0 });
      },
      onOneTriangle: function (x1, y1, x2, y2, x3, y3, _vpId) {
        checkAddPoint(polyPoints, { x: x1, y: y1, z: 0 });
        checkAddPoint(polyPoints, { x: x2, y: y2, z: 0 });
        checkAddPoint(polyPoints, { x: x3, y: y3, z: 0 });
      },
      onTexQuad: function (cx, cy, width, height, rotation, _vpId) {
        centers.push({ x: cx, y: cy, z: 0 });
      },
    });
  });
  centers.forEach((c) => {
    checkRemovePoint(polyPoints, { x: c.x, y: c.y, z: 0 });
  });
  return polyPoints;
}
The functions checkAddPoint and checkRemovePoint are just helpers that make sure we don't duplicate points, taking rounding into account (so we don't end up with two points such as 0,0,0 and 0,0.00001,0).
I then use those points to draw with the Edit2D extension. What I would expect here is a series of points that trace all the straight lines of the polyline, and when it gets to an arc, a single segment from one endpoint of the arc to the other. That is mostly what I see.
Here is an example file as it looks in ACAD:
Notice there are a handful of breaks in the arc around the outside of the room. What I get when I do the above process is this:
Notice that all along the top I get what I would expect. However, along the bottom, in two places, I get a huge number of segments all along the line.
I went back to the ACAD file, exploded the polyline, and examined it as best I know how, and I can't find anything different about those two segments vs. the others that would indicate why they act differently.
What would be really awesome is an easy way to segment along an arc, say every x units, and have that returned, but I'm not expecting that here; I just want to know why it is treating these arcs differently.
Any help is greatly appreciated.
Edit
I should also mention that I have logged the creation routine, and the only callbacks in it that are ever hit are onLineSegment and onCircularArc. As you can see, the circular-arc callback only records the center point so it can later be removed from the list, so all of these extra points are, for some reason, coming through the line-segment callback.

Raster to point conversion on Google Earth Engine: how to convert every single pixel?

Good day. I am trying to convert a raster to points using Google Earth Engine. My raster has one band (clusters) and has been clipped to my ROI. I am aware of the reduceToVectors function in Google Earth Engine, but as I understand it, this function creates areas of the same adjacent value, whereas what I want is to create as many points as there are pixels.
So far, I have tried different versions of:
var vectors = image.reduceToVectors({
  reducer: null,
  geometry: treat,
  scale: 30,
  crs: image.projection().getInfo().crs,
  geometryType: 'centroid',
  labelProperty: 'null',
  eightConnected: false,
  maxPixels: 1e15
});
Thanks a lot for your help.
ee.Image.sample returns a point for every pixel.
var vectors = image.sample({
  region: treat,
  geometries: true, // if you want points
});
If you do not specify a scale and crs, it will sample each pixel at the input image's original resolution. If you do, it will sample at the given scale instead.
Demonstration script:
var region = ee.Geometry.Polygon(
    [[[-110.00683426856995, 40.00274575078824],
      [-110.00683426856995, 39.99948706365032],
      [-109.99576210975647, 39.99948706365032],
      [-109.99576210975647, 40.00274575078824]]], null, false);
var image = ee.Image('CGIAR/SRTM90_V4');
Map.setCenter(-110, 40, 16);
Map.addLayer(image, {min: 1000, max: 2000}, 'SRTM');
var vectors = image.sample({
  region: region,
  geometries: true,
});
print(vectors);
Map.addLayer(ee.FeatureCollection([region]).style({"color": "white"}));
Map.addLayer(vectors);
https://code.earthengine.google.com/625a710d6d315bad1c2438c73bde843b

plotting maps using OSM or other shapefiles and matplotlib for a standardized report

We are developing a standardized report for our activities. The last graph I need is to display the geographic area of the activities (there are close to 100 locations).
The output for these reports is PDF, letter or A4 size.
The report is a matplotlib figure, where:
fig = plt.figure(figsize=(8.5, 11))
rect0 = 0, .7, .18, .3
rect1 = .3, .7, .18, .3
rect2 = .8, .29, .2, .7
rect3 = 0, 0, .8, .4
ax1 = fig.add_axes(rect0)
ax2 = fig.add_axes(rect1)
ax3 = fig.add_axes(rect2)
ax4 = fig.add_axes(rect3)
The contents and layout for axes 1-3 are settled and work great. However, ax4 is where the map contents would ideally be displayed.
I was hoping to do something like this:
map1 = Basemap(llcrnrlon=6.819087, llcrnrlat=46.368452, urcrnrlon=6.963978,
               urcrnrlat=46.482906, resolution='h', projection='tmerc',
               lon_0=6.88, lat_0=46.42, ax=ax4)
map1.readshapefile('a valid shape file that works')  # <----- this is the sticking point
# draw the locator coordinates on the map
plt.savefig('report.pdf')  # figure to be inserted into the document
plt.show()
However, I have not been successful in obtaining a shapefile that works from OpenStreetMap or other GIS sources. Nor have I identified the correct process to transform the data from OpenStreetMap, or to extract that information from the OSM/XML document or the transformed GeoJSON document.
Ideally I would like to grab the bounding box information from OpenStreetMap and generate the map directly.
What is the process to get a shapefile that works with the .readshapefile() call?
Or, alternatively, how do I get the defined map into a matplotlib axes?
It might be easiest to use the cartopy.io.img_tiles module, which will automatically pull the OSM tiles for use with cartopy. Using the pre-rendered tiles would avoid the trouble of handling and styling individual shapefiles/XML.
See the cartopy docs on using these tiles within cartopy.
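For instance, a minimal sketch, assuming cartopy is installed; the extent reuses the llcrnr/urcrnr bounds from your Basemap attempt, and the marker coordinate is hypothetical:

import matplotlib.pyplot as plt
import cartopy.crs as ccrs
from cartopy.io.img_tiles import OSM

tiler = OSM()  # fetches pre-rendered OpenStreetMap tiles

fig = plt.figure(figsize=(8.5, 11))
# Give the map axes the tiler's projection so the tiles need no reprojection.
ax4 = fig.add_axes([0, 0, .8, .4], projection=tiler.crs)
ax4.set_extent([6.819087, 6.963978, 46.368452, 46.482906],
               crs=ccrs.PlateCarree())
ax4.add_image(tiler, 13)  # zoom level 13; higher means more detail

# Mark an activity location (hypothetical coordinates).
ax4.plot(6.88, 46.42, 'ro', transform=ccrs.PlateCarree())
plt.savefig('report.pdf')

The other three axes can be added to the same figure exactly as in your current layout.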

Graphhopper: Cannot create location index when graph has invalid bounds

I am using graphhopper 0.8 via Maven in my Java project. I create a network with the following code:
FlagEncoder encoder = new CarFlagEncoder();
EncodingManager em = new EncodingManager(encoder);
// Creating and saving the graph
GraphBuilder gb = new GraphBuilder(em).
        setLocation(testDir).
        setStore(true).
        setCHGraph(new FastestWeighting(encoder));
GraphHopperStorage graph = gb.create();
for (Node node : ALL NODES OF MY NETWORK) {
    graph.getNodeAccess().setNode(uniqueNodeId, nodeX, nodeY);
}
for (Link link : ALL LINKS OF MY NETWORK) {
    EdgeIteratorState edge = graph.edge(fromNodeId, toNodeId);
    edge.setDistance(linkLength);
    edge.setFlags(encoder.setProperties(linkSpeedInMeterPerSecond * 3.6, true, false));
}
Weighting weighting = new FastestWeighting(encoder);
PrepareContractionHierarchies pch = new PrepareContractionHierarchies(graph.getDirectory(), graph,
        graph.getGraph(CHGraph.class), weighting, TraversalMode.NODE_BASED);
pch.doWork();
graph.flush();
LocationIndex index = new LocationIndexTree(graph.getBaseGraph(), graph.getDirectory());
index.prepareIndex();
index.flush();
At this point, the bounding box saved in the graph shows the correct numbers. Files are written to disk, including the "location_index". However, reloading the data gives me the following error:
Exception in thread "main" java.lang.IllegalStateException: Cannot create location index when graph has invalid bounds: 1.7976931348623157E308,1.7976931348623157E308,1.7976931348623157E308,1.7976931348623157E308
at com.graphhopper.storage.index.LocationIndexTree.prepareAlgo(LocationIndexTree.java:132)
at com.graphhopper.storage.index.LocationIndexTree.prepareIndex(LocationIndexTree.java:287)
The reading is done with the following code:
FlagEncoder encoder = new CarFlagEncoder();
EncodingManager em = new EncodingManager(encoder);
GraphBuilder gb = new GraphBuilder(em).
        setLocation(testDir).
        setStore(true).
        setCHGraph(new FastestWeighting(encoder));
// Load and use the graph
GraphHopperStorage graph = gb.load();
// Load the index
LocationIndex index = new LocationIndexTree(graph.getBaseGraph(), graph.getDirectory());
if (!index.loadExisting()) {
    index.prepareIndex();
}
So LocationIndexTree.loadExisting runs fine until it enters prepareAlgo. At this point, the graph is loaded. However, the bounding box is not set and stays at the defaults?! Reading the location index does not update the bounding box; hence the error downstream. What am I doing wrong? How do I preserve the bounding box in the first place? How can I reconstruct the bbox?
TL;DR: Don't use Cartesian coordinates; stick to the WGS84 coordinates used by OSM.
A Cartesian coordinate system such as EPSG:25832 can have coordinates in the millions, and subsequent arithmetic can increase their magnitude further. GraphHopper eventually stores coordinates as integers, so all coordinates may end up as Integer.MAX_VALUE. Hence, an invalid bounding box.
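If your network data is in such a projected CRS, convert the node coordinates to WGS84 before calling setNode. A minimal sketch with pyproj, assuming EPSG:25832 as the source CRS (substitute whatever your network actually uses):

# Convert projected coordinates (assumed EPSG:25832 here) to the
# WGS84 lat/lon that GraphHopper expects.
from pyproj import Transformer

# always_xy=True keeps (easting, northing) in, (lon, lat) out.
transformer = Transformer.from_crs("EPSG:25832", "EPSG:4326", always_xy=True)

easting, northing = 600000.0, 5140000.0  # example projected coordinates
lon, lat = transformer.transform(easting, northing)
# Pass these to graph.getNodeAccess().setNode(nodeId, lat, lon).
print(lat, lon)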

google maps v3: center of bounds is different from center of the map

How is it possible that map.getCenter() might be different from map.getBounds().getCenter()?
> cragMap.getCenter()
> Q {d: 13.823563748466814, e: 0, toString: function, b: function, equals: function…}
> cragMap.getBounds().getCenter()
> Q {d: 5.9865924355766005, e: 0, toString: function, b: function, equals: function…}
This happens in my case and prevents me from implementing one particular feature. Any idea what the cause of this is?
It is caused by the latitude non-linearity of the Mercator projection. map.getBounds().getCenter() returns the arithmetic average of the bounds' latitude and longitude extents, but that average usually differs from the visual center of the map, because the latitude scale changes as you move north or south.
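A small worked example of the effect, in plain Python, with hypothetical bounds spanning 0° to 60° latitude:

import math

def mercator_y(lat_deg):
    # Mercator y-coordinate for a given latitude.
    lat = math.radians(lat_deg)
    return math.log(math.tan(math.pi / 4 + lat / 2))

def inverse_mercator(y):
    return math.degrees(2 * math.atan(math.exp(y)) - math.pi / 2)

south, north = 0.0, 60.0  # hypothetical viewport bounds

# getBounds().getCenter(): arithmetic mean of the latitudes.
bounds_center_lat = (south + north) / 2  # 30.0

# map.getCenter(): the visual center, i.e. the midpoint in projected space.
map_center_lat = inverse_mercator((mercator_y(south) + mercator_y(north)) / 2)
print(bounds_center_lat, map_center_lat)  # 30.0 vs roughly 35.25

The further the viewport is from the equator, the bigger the gap between the two centers.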