Why is nodeA smaller than nodeB in GraphHopper data layout and how the edge direction is represented? - graphhopper

The document https://github.com/graphhopper/graphhopper/blob/master/docs/core/technical.md states that "nodeA is always smaller than nodeB" related to GraphHopper data layout. Which are the benefits of implementing it that way? How the edge direction is represented in data layout?

This is just a convention.
The direction can be different and depends on how you traverse the graph because for the bidirectional algorithms you need to access every edge from both sides, even if it is a directed edge. E.g. if you have node Y and X, you can either do edgeIterator=edgeExplorer.setBaseNode(X) or setBaseNode(Y). And depending on the returned flags (edgeIterator.getFlags) you can find out the accesibility for every stored vehicles.

Related

calculate distance between camera and different sized objects

I have been trying to develop a small object detection system for my college project.
The main idea is that i have a robot , that can pick one particular "object" from the surroundings, for this purpose i am using only a single camera, with known intrinsic parameters.
I have already developed an object detection system, which can predict bounding box coordinates,
using these coordinates and size of bounding boxes, i am able to predict perceived depth, using "Triangle similarity" method,
The problem that i am facing is , this particular "object" can vary in size, which means the objects located at the same distance can also have different sized bounding boxes.
What could be the other way to detect rough estimate from camera to object, given an object doesn't have a fixed size.
Cannot be done in general, since scale information is lost in camera projection.
Depending on your particular case, you may be able to use more indirect methods to infer distance. For example, if the subject rests on a ground plane, you may be able to exploit knowledge of the shape and size of patterns on that floor. More sophisticated methods were analyzed many years ago - the general subject goes under the heading of "single-view metrology". A good reference is Antonio Criminisi's 1999 PhD thesis.
As suggested above, you can not get the absolute depth of objects from monocular camera (single view).
I would suggest to try out following approaches:
Use some reference scale attached to each object eg. you can add and detect ArUco marker on each object and find the corresponding object's orientation and depth.
Above approach might not be feasible if you have unkonwn number of objects, you can use deep learning based models for monocular depth estimation

igraph: layout by node attribute

Is there a way to weigh layout according to node attributes in igraph? In other words, how to get nodes that share the same characteristics (but do not have edges between them) cluster more closely together?
While many layout functions can take edge weight into account, the nodes that I want to be closer to each other do not have edges between them. An example of such situation is if the graph is bipartite. Using layouts such as fruchterman.reingold is not very informative as vertices of the two different types are interspersed. However, I do not want it be be as extreme as the layout.bipartite option either as it would be rather messy when there are lots of vertices. What I wish is to have a layout that is somewhere between these two, having vertices of the same type to be on one side, and also cluster according to certain attributes, with edges between the two types.
Any idea or suggestion will be greatly appreciated. Thanks!
igraph layouts are simply matrices with 2 columns and N rows, so you can easily re-use one layout with another graph as long as the two graphs share the same number of nodes. You can make use of this here: create a graph where you connect the nodes that you want to be placed close to each other, calculate the layout using this graph, and then plot your original graph with the layout you have calculated.

Randomly Generate Directed Graph on a grid

I am trying to randomly generate a directed graph for the purpose of making a puzzle game similar to the ice sliding puzzles from pokemon.
This is essentially what I want to be able to randomly generate: http://bulbanews.bulbagarden.net/wiki/Crunching_the_numbers:_Graph_theory
I need to be able to limit the size of the graph in an x and y dimension. In the example in the link, it would be restricted to an 8x4 grid.
The problem I am running in to is not randomly generating the graph, but randomly generating a graph which I can properly map out in a 2d space, since I need something (like a rock) on the opposite side of a node, to make it visually make sense when you stop sliding. The problem with this is sometimes the rock ends up in the path between two other nodes or possibly on another node itself, which causes the entire graph to become broken.
After discussing the problem with a few people I know, we came to a couple of conclusions that may lead to a solution. Including the obstacles in the grid as part of the graph when constructing it. Start out with a fully filled grid and just draw a random path and delete out blocks that will make that path work, though the problem then becomes figuring out which ones to delete so that you don't accidentally introduce an additional, shorter path. We were also thinking a dynamic programming algorithm may be beneficial, though none of us are too skilled with creating dynamic programming algorithms from nothing. Any ideas or references about what this problem is officially called (if it's an official graph problem) would be most helpful.
I wouldn't look at it as a graph problem, since as you say the representation is incomplete. To generate a puzzle I would work directly on a grid, and work backwards; first fix the destination spot, then place rocks in some way to reach it from one or more spots, and iteratively add stones to reach those other spots, with the constraint that you never add a stone which breaks all the paths to the destination.
You might want to generate a planar graph, which means that the edges of the graph will not overlap each other in a two dimensional space. Another definition of planar graphs ist that each planar graph does not have any subgraphs of the type K_3,3 (complete bi-partite with six nodes) or K_5 (complete graph with five nodes).
There's a paper on the fast generation of planar graphs.

document image processing

I working on an application for processing document images (mainly invoices) and basically, I'd like to convert certain regions of interest into an XML-structure and then classify the document based on that data. Currently I am using ImageJ for analyzing the document image and Asprise/tesseract for OCR.
Now I am looking for something to make developing easier. Specifically, I am looking for something to automatically deskew a document image and analyze the document structure (e.g. converting an image into a quadtree structure for easier processing). Although I prefer Java and ImageJ I am interested in any libraries/code/papers regardless of the programming language it's written in.
While the system I am working on should as far as possible process data automatically, the user should oversee the results and, if necessary, correct the classification suggested by the system. Therefore I am interested in using machine learning techniques to achieve more reliable results. When similar documents are processed, e.g. invoices of a specific company, its structure is usually the same. When the user has previously corrected data of documents from a company, these corrections should be considered in the future. I have only limited knowledge of machine learning techniques and would like to know how I could realize my idea.
The following prototype in Mathematica finds the coordinates of blocks of text and performs OCR within each block. You may need to adapt the parameters values to fit the dimensions of your actual images. I do not address the machine learning part of the question; perhaps you would not even need it for this application.
Import the picture, create a binary mask for the printed parts, and enlarge these parts using an horizontal closing (dilation and erosion).
Query for each blob's orientation, cluster the orientations, and determine the overall rotation by averaging the orientations of the largest cluster.
Use the previous angle to straighten the image. At this time OCR is possible, but you would lose the spatial information for the blocks of text, which will make the post-processing much more difficult than it needs to be. Instead, find blobs of text by horizontal closing.
For each connected component, query for the bounding box position and the centroid position. Use the bounding box positions to extract the corresponding image patch and perform OCR on the patch.
At this point, you have a list of strings and their spatial positions. That's not XML yet, but it sounds like a good starting point to be tailored straightforwardly to your needs.
This is the code. Again, the parameters (structuring elements) of the morphological functions may need to change, based on the scale of your actual images; also, if the invoice is too tilted, you may need to "rotate" roughly the structuring elements in order to still achieve good "un-skewing."
img = ColorConvert[Import#"http://www.team-bhp.com/forum/attachments/test-drives-initial-ownership-reports/490952d1296308008-laura-tsi-initial-ownership-experience-img023.jpg", "Grayscale"];
b = ColorNegate#Binarize[img];
mask = Closing[b, BoxMatrix[{2, 20}]]
orientations = ComponentMeasurements[mask, "Orientation"];
angles = FindClusters#orientations[[All, 2]]
\[Theta] = Mean[angles[[1]]]
straight = ColorNegate#Binarize[ImageRotate[img, \[Pi] - \[Theta], Background -> 1]]
TextRecognize[straight]
boxes = Closing[straight, BoxMatrix[{1, 20}]]
comp = MorphologicalComponents[boxes];
measurements = ComponentMeasurements[{comp, straight}, {"BoundingBox", "Centroid"}];
texts = TextRecognize#ImageTrim[straight, #] & /# measurements[[All, 2, 1]];
Cases[Thread[measurements[[All, 2, 2]] -> texts], (_ -> t_) /; StringLength[t] > 0] // TableForm
The paper we use for skew angle detection is: Skew detection and text line position determination in digitized documents by Gatos et. al. The only limitation with this paper is that it can detect skew upto -5 and +5 degrees. After that, we need something to slap the user with a message! :)
In your case, where there are primarily invoice scans, you may beautifully use: Multiresolution Analysis in Extraction of Reference Lines from Documents with Gray Level Background by Tag et. al.
We wrote the code in MATLAB, if you need help let me know!
I worked on a similar project once, and for being a long time user of OpenCV I ended up using it once again. OpenCV is a popular-cross-platform-computer-vision-library that offers programming interfaces for C and C++.
I found an interesting blog that had a post on how to detect the skew angle of a text using OpenCV, and then another on how to deskew.
To retrieve the text of the document and be able to pass a smaller image to tesseract, I suggest taking a look at the bounding box technique.
I don't know if the image acquisition procedure is your responsibility, but if it is you might want to take a look at how to do camera calibration with OpenCV to fix the distortion in the image caused by some camera lenses.

Effective data structure for overlapping spatial areas

I'm writing a game where a large number of objects will have "area effects" over a region of a tiled 2D map.
Required features:
Several of these area effects may overlap and affect the same tile
It must be possible to very efficiently access the list of effects for any given tile
The area effects can have arbitrary shapes but will usually be of the form "up to X tiles distance from the object causing the effect" where X is a small integer, typically 1-10
The area effects will change frequently, e.g. as objects are moved to different locations on the map
Maps could be potentially large (e.g. 1000*1000 tiles)
What data structure would work best for this?
Providing you really do have a lot of area effects happening simultaneously, and that they will have arbitrary shapes, I'd do it this way:
when a new effect is created, it is
stored in a global list of effects
(not necessarily a global variable,
just something that applies to the
whole game or the current game-map)
it calculates which tiles
it affects, and stores a list of those tiles against the effect
each of those tiles is
notified of the new effect, and
stores a reference back to it in a
per-tile list (in C++ I'd use a
std::vector for this, something with
contiguous storage, not a linked
list)
ending an effect is handled by iterating through
the interested tiles and removing references to it, before destroying it
moving it, or changing its shape, is handled by removing
the references as above, performing the change calculations,
then re-attaching references in the tiles now affected
you should also have a debug-only invariant check that iterates through
your entire map and verifies that the list of tiles in the effect
exactly matches the tiles in the map that reference it.
Usually it depends on density of your map.
If you know that every tile (or major part of tiles) contains at least one effect you should use regular grid – simple 2D array of tiles.
If your map is feebly filled and there are a lot of empty tiles it make sense to use some spatial indexes like quad-tree or R-tree or BSP-trees.
Usually BSP-Trees (or quadtrees or octrees).
Some brute force solutions that don't rely on fancy computer science:
1000 x 1000 isn't too large - just a meg. Computers have Gigs. You could have an 2d array. Each bit in the bytes could be a 'type of area'. The 'effected area' that's bigger could be another bit. If you have a reasonable amount of different types of areas you can still use a multi-byte bit mask. If that gets ridiculous you can make the array elements pointers to lists of overlapping area type objects. But then you lose efficiency.
You could also implement a sparse array - using a hashtable key'd off of the coords (e.g., key = 1000*x+y) - but this is many times slower.
If course if you don't mind coding the fancy computer science ways, they usually work much better!
If you have a known maximum range of each area effect, you could use a data structure of your choosing and store the actual sources, only, that's optimized for normal 2D Collision Testing.
Then, when checking for effects on a tile, simply check (collision detection style, optimized for your data structure) for all effect sources within the maximum range and then applying a defined test function (for example, if the area is a circle, check if the distance is less than a constant; if it's a square, check if the x and y distances are each within a constant).
If you have a small (<10) amount of effect "field" shapes, you can even do a unique collision detection for each effect field type, within their pre-computed maximum range.