Reducing the size (as in area) of the graph generated by graphviz - configuration

Does anyone have any general tips for reducing the size of a graph generated by graphviz (size as in area, not as in file size).
I have a fairly large graph (700 nodes). I set a smaller font size for each node, but it seems to only reduce the font size and not the actual node size. Are there any attributes to reduce the overall amount of blank space in the graph also? Thanks!

In my experience rendering graphs of that size (~700 nodes) with graphviz, a little trial-and-error adjustment of the following combination of attribute settings--some structural, some purely aesthetic--for all three objects (graph, nodes, and edges) should do what you want (a consolidated example follows the list):
reduce the minimum separation between nodes via 'nodesep', e.g., graph[nodesep=0.1]; the default is 0.25, so going much lower risks making your graph look "too compact" (nodesep and ranksep probably affect how dot draws a graph more than any other adjustable parameter)
reduce the minimum distance between nodes of different ranks via 'ranksep', e.g., graph[ranksep=0.3]; the default is 0.5, and this will affect your layout significantly if your graph is comprised of many ranks
increase the edge weights, e.g., edge[weight=1.2]; this will make the edges shorter, in turn making the entire graph more compact
remove node borders and node fill, e.g., node[color=none, shape=plaintext]; especially for oval-shaped nodes, a substantial fraction of the total node space is 'unused' (i.e., not used to display the node label), so this reduces each node's footprint to just its text
explicitly set the font size for the nodes (node borders are enlarged so that they surround the node text, which means that the font size and the amount of text for a given node have a significant effect on its size); node[fontsize=11] should be large enough to remain legible yet still reduce the 'cluttered' appearance (the default size is 14)
use different colors for nodes and edges--this makes your graph easier to read; e.g., set the node fontcolor to blue and the edge fontcolor to grey to help the eye distinguish the two sets of graph structures. This makes a bigger difference than you might think.
explicitly set the total graph size, e.g., graph[size="7.75,10.25"]; this ensures that your graph fits on an 8.5 x 11 page and occupies the entire space
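Putting those settings together, here is a minimal sketch using the Python graphviz bindings (the package and the specific values are my assumption; the same attribute names work in plain DOT). Tune the numbers by trial and error:

from graphviz import Digraph   # Python bindings for Graphviz; assumes the dot binary is installed

g = Digraph("compact")
# graph-level attributes: overall page size, tighter node and rank separation
g.attr("graph", size="7.75,10.25", nodesep="0.1", ranksep="0.3")
# node defaults: no border or fill, smaller blue text
g.attr("node", shape="plaintext", color="none", fontsize="11", fontcolor="blue")
# edge defaults: heavier weight (shorter edges), grey labels
g.attr("edge", weight="1.2", fontcolor="grey")
g.edge("a", "b")
print(g.source)    # the equivalent DOT; g.render("compact") writes the drawing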

Related

Understanding MaskRCNN Annotation feed

I'm currently working on an object detection project using Matterport MaskRCNN.
Part of the job is to detect a green leaf that crosses a white grid. Until now I have defined the annotations (polygons) in such a way that every single leaf which crosses the net (and gives a white-green-white pattern) is considered a valid annotation.
But when I changed the definition above from single-cross annotation to multi-cross (more than one leaf crossing the net at once), I started to see a serious decrease in model performance during the testing phase.
This raised my question: the only difference between the two comes down to the size of the annotation. So:
Which of the following is more influential on learning during MaskRCNN's training - pattern or size?
If the pattern is what matters, that's better, because the goal is to identify a crossing. Conversely, if the size of the annotation is the main influence, that's a problem, because I don't want the model to look for multi-cross (or, alternatively, large single-cross) regions in the image.
P.S. - References to recommended articles that explain the subject will be welcomed
Thanks in advance
If I understand correctly, the shape of the annotation becomes longer and more stretched out when you go for multi-cross annotation.
In that case you can change the size and aspect ratio of the anchors that scan the image for objects. With default settings the model often produces squarish bounding boxes. This means that very long and narrow annotations create bounding boxes with a great difference between width and height, and these objects seem to be harder for the model to segment and detect.
These are the default configurations in the config.py file:
# Length of square anchor side in pixels
RPN_ANCHOR_SCALES = (32, 64, 128, 256, 512)
# Ratios of anchors at each cell (width/height). A value of 1 represents a square anchor, and 0.5 is a wide anchor
RPN_ANCHOR_RATIOS = [0.5, 1, 2]
You can play around with these values in inference mode and see whether that gives you better results.
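For instance, here is a rough sketch of how those overrides might look by subclassing the Matterport Config class (the class name, class count, and ratio values are illustrative assumptions; keeping the same number of ratios should keep the trained RPN head compatible):

from mrcnn.config import Config   # Matterport Mask R-CNN

class LeafConfig(Config):              # hypothetical name, for illustration only
    NAME = "leaf"
    NUM_CLASSES = 1 + 1                # background + leaf crossing (assumed)
    IMAGES_PER_GPU = 1
    # Same anchor sides as the defaults, but more elongated aspect ratios so that
    # long, narrow multi-cross annotations are matched by some anchor.
    RPN_ANCHOR_SCALES = (32, 64, 128, 256, 512)
    RPN_ANCHOR_RATIOS = [0.33, 1, 3]   # still three ratios, just more extreme

config = LeafConfig()
config.display()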

Anchor Boxes in YOLO: How are they decided

I have gone through a couple of YOLO tutorials, but I am finding it somewhat hard to figure out whether the anchor boxes for each cell the image is divided into are predetermined. In one of the guides I went through, the image was divided into 13x13 cells and it stated that each cell predicts 5 anchor boxes (bigger than the cell itself; here's my first problem, because it also says the cell would first detect what object is present in it before predicting the boxes).
How can a small cell predict anchor boxes for an object bigger than it? Also, it's said that each cell classifies before predicting its anchor boxes; how can a small cell classify the right object without querying neighbouring cells if only a small part of the object falls within the cell?
E.g., say one of the 13x13 cells contains only the white pocket part of a man wearing a T-shirt; how can that cell correctly classify that a man is present without being linked to its neighbouring cells? With a normal CNN, when trying to localize a single object, I know the bounding box prediction relates to the whole image, so at least I can say the network has an idea of what's going on everywhere in the image before deciding where the box should be.
PS: What I currently think is that YOLO basically assigns each cell predetermined anchor boxes, with a classifier at each one, before the boxes with the highest scores for each class are selected, but I am sure it doesn't add up somewhere.
UPDATE: I made a mistake with this question; it should have been about how regular bounding boxes are decided rather than anchor/prior boxes. So I am marking @craq's answer as correct, because that's how anchor boxes are decided according to the YOLO v2 paper.
I think there are two questions here. Firstly, the one in the title, asking where the anchors come from. Secondly, how anchors are assigned to objects. I'll try to answer both.
Anchors are determined by a k-means procedure, looking at all the bounding boxes in your dataset. If you're looking at vehicles, the ones you see from the side will have an aspect ratio of about 2:1 (width = 2*height). The ones viewed from in front will be roughly square, 1:1. If your dataset includes people, the aspect ratio might be 1:3. Foreground objects will be large, background objects will be small. The k-means routine will figure out a selection of anchors that represents your dataset: k=5 for YOLOv2, while YOLOv3 clusters k=9 and splits them across its three detection scales; the number of anchors differs between YOLO versions.
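As a rough illustration of that clustering (not the actual darknet code), using 1 - IoU between box shapes as the distance, as in the YOLOv2 paper:

import numpy as np

def iou_wh(box, anchors):
    """IoU between one (w, h) box and an array of (w, h) anchors, both centred at the origin."""
    inter = np.minimum(box[0], anchors[:, 0]) * np.minimum(box[1], anchors[:, 1])
    union = box[0] * box[1] + anchors[:, 0] * anchors[:, 1] - inter
    return inter / union

def kmeans_anchors(boxes, k=5, iters=100, seed=0):
    """boxes: (N, 2) array of ground-truth widths/heights. Returns k anchor shapes."""
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        # assign each box to the anchor with the highest IoU (distance = 1 - IoU)
        assign = np.array([np.argmax(iou_wh(b, anchors)) for b in boxes])
        # move each anchor to the median shape of its cluster (keep it if the cluster is empty)
        anchors = np.array([np.median(boxes[assign == i], axis=0) if np.any(assign == i)
                            else anchors[i] for i in range(k)])
    return anchors

Running this over all ground-truth (width, height) pairs in your dataset gives anchor shapes that are representative of it.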
It's useful to have anchors that represent your dataset, because YOLO learns how to make small adjustments to the anchor boxes in order to create an accurate bounding box for your object. YOLO can learn small adjustments better/easier than large ones.
The assignment problem is trickier. As I understand it, part of the training process is for YOLO to learn which anchors to use for which object. So the "assignment" isn't deterministic like it might be for the Hungarian algorithm. Because of this, in general, multiple anchors will detect each object, and you need to do non-max-suppression afterwards in order to pick the "best" one (i.e. highest confidence).
There are a couple of points that I needed to understand before I came to grips with anchors:
Anchors can be any size, so they can extend beyond the boundaries of the 13x13 grid cells. They have to be, in order to detect large objects.
Anchors only enter in the final layers of YOLO. YOLO's neural network makes 13x13x5=845 predictions (assuming a 13x13 grid and 5 anchors). The predictions are interpreted as offsets to anchors from which to calculate a bounding box. (The predictions also include a confidence/objectness score and a class label.)
YOLO's loss function compares each object in the ground truth with one anchor. It picks the anchor (before any offsets) with highest IoU compared to the ground truth. Then the predictions are added as offsets to the anchor. All other anchors are designated as background.
If anchors which have been assigned to objects have high IoU, their loss is small. Anchors which have not been assigned to objects should predict background by setting confidence close to zero. The final loss function is a combination from all anchors. Since YOLO tries to minimise its overall loss function, the anchor closest to ground truth gets trained to recognise the object, and the other anchors get trained to ignore it.
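A tiny sketch of that assignment step (shape-only IoU, ignoring offsets; a hypothetical helper, not darknet's actual code):

import numpy as np

def assign_anchor(gt_wh, anchors_wh):
    """Pick the index of the anchor whose (w, h) best overlaps the ground-truth (w, h)."""
    gw, gh = gt_wh
    inter = np.minimum(gw, anchors_wh[:, 0]) * np.minimum(gh, anchors_wh[:, 1])
    union = gw * gh + anchors_wh[:, 0] * anchors_wh[:, 1] - inter
    return int(np.argmax(inter / union))   # all other anchors at that cell are treated as background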
The following pages helped my understanding of YOLO's anchors:
https://medium.com/@vivek.yadav/part-1-generating-anchor-boxes-for-yolo-like-network-for-vehicle-detection-using-kitti-dataset-b2fe033e5807
https://github.com/pjreddie/darknet/issues/568
I think that your statement about the number of predictions of the network could be misleading. Assuming a 13 x 13 grid and 5 anchor boxes, the output of the network has, as I understand it, the following shape: 13 x 13 x 5 x (2 + 2 + 1 + nbOfClasses)
13 x 13: the grid
x 5: the anchors
x (2 + 2 + 1 + nbOfClasses): (x, y)-coordinates of the center of the bounding box (in the coordinate system of each cell), (h, w)-deviation of the bounding box (deviation from the prior anchor boxes), an objectness/confidence score, and a softmax-activated class vector giving a probability for each class.
If you want more information about the determination of the anchor priors, you can take a look at the original paper on arXiv: https://arxiv.org/pdf/1612.08242.pdf.
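To make the decoding concrete, here is a schematic NumPy sketch of how a YOLOv2-style output of that shape is commonly turned into boxes; the exact activations and anchor units vary between implementations, so treat this as an assumption-laden illustration:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode(raw, anchors, num_classes):
    """raw: (13, 13, 5, 5 + num_classes) network output; anchors: (5, 2) widths/heights in grid units."""
    S = raw.shape[0]                                   # grid size, e.g. 13
    cy, cx = np.meshgrid(np.arange(S), np.arange(S), indexing="ij")
    bx = (sigmoid(raw[..., 0]) + cx[..., None]) / S    # centre x, as a fraction of the image
    by = (sigmoid(raw[..., 1]) + cy[..., None]) / S    # centre y
    bw = anchors[:, 0] * np.exp(raw[..., 2]) / S       # width, offset applied to the anchor prior
    bh = anchors[:, 1] * np.exp(raw[..., 3]) / S       # height
    objectness = sigmoid(raw[..., 4])
    class_probs = np.exp(raw[..., 5:])
    class_probs /= class_probs.sum(axis=-1, keepdims=True)   # softmax over classes
    return bx, by, bw, bh, objectness, class_probs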

Preferred value to encode 96 DPI within PNG

PNG files may contain chunks of optional information. One of these optional information blocks is the physical resolution of the image (chunk signature pHYs).[1] [2] It contains separate values for horizontal and vertical resolution as pixels per unit, and a unit specifier, which can be 0 for "unit unspecified" or 1 for metre ← that's quite confusing, because resolutions are traditionally expressed in DPI.
The Inch is defined as 25.4 mm in the metric system.
So, if I calculate this correctly, 96 DPI means 3779.527559... dots per metre. For the pHYs chunk, this has to be rounded. I'd say 3780 is the right value, but I have also found 3779 suggested on the web. Images of both kinds coexist on my machine.
The difference may not be important in most cases,
3779 * 0.0254 = 95.9866
3780 * 0.0254 = 96.012
but I try to avoid tricky layout problems when mixing images of both kinds in processes that are DPI-aware, like creating PDF files with LaTeX.
[1] Portable Network Graphics (PNG) Specification (Second Edition), section 11.3.5.3 pHYs Physical pixel dimensions
[2] PNG Specification: Chunk Specifications, section 4.2.4.2. pHYs Physical pixel dimensions
The relative difference is less than 0.03% (2.65/10000); it's hardly relevant.
Anyway, I'd go with 3780. Not only is it the nearest value, it would also give back the correct value if some (sloppy) converter rounds down (instead of rounding to the nearest).
Also, if you google "72.009 DPI PNG" you'll see a similar (non-)issue with 72 DPI (example), and it seems that most people rounded the value up (which is also the nearest): 2834.645 -> 2835.
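The arithmetic is easy to double-check with a couple of lines of Python (plain arithmetic, no PNG library involved):

INCH_M = 0.0254                          # metres per inch

def dpi_to_ppm(dpi):
    """Dots per inch -> pixels per metre, rounded to the nearest integer for the pHYs chunk."""
    return round(dpi / INCH_M)

def ppm_to_dpi(ppm):
    return ppm * INCH_M

print(dpi_to_ppm(96))                    # 3780  (96 / 0.0254 = 3779.527...)
print(ppm_to_dpi(3779), ppm_to_dpi(3780))  # 95.9866 and 96.012
print(dpi_to_ppm(72))                    # 2835  (72 / 0.0254 = 2834.645...)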

Is there any difference between user units and pixels?

I've been reading several articles about SVG that make a clear distinction between using and not using units (this last case even has a name of its own), e.g.
<!-- the viewport will be 800px by 600px -->
<svg width="800" height="600">
<!-- SVG content drawn onto the SVG canvas -->
</svg>
In SVG, values can be set with or without a unit identifier. A unitless value is said to be specified in user space using user units. If a value is specified in user units, then the value is assumed to be equivalent to the same number of “px” units. This means that the viewport in the above example will be rendered as a 800px by 600px viewport.
You can also specify values using units. The supported length unit identifiers in SVG are: em, ex, px, pt, pc, cm, mm, in, and percentages.
source
Is there any actual difference between omitting the unit and setting it to px?
Can I just set e.g. mm everywhere to avoid ambiguity, or will I eventually get different results?
<svg width="800mm" height="600mm">
Disclaimer: what follows is pure guessing (I only learnt the basics of SVG last week), but I'm sharing it because I believe it could help others with the same doubts, and I hope it doesn't contain serious errors.
The SVG canvas is basically a mental concept: an infinite plane where you use Cartesian coordinates to place stuff and move around. It isn't too different from stroking shapes on a sheet of graph paper on which you've drawn a cross to mark an arbitrary point as the coordinate origin, except that notebooks are not infinite. In the same way that you draw a 3-square-radius circle on the sheet without caring that those squares represent 12 mm, you draw shapes on your SVG canvas using unitless dimensions, because it doesn't really matter what exact physical size they represent. The SVG spec uses the term "user units" to express this idea.
Using actual units only makes sense in two situations:
When our virtual user units need to interact with the real world, e.g., the canvas is to be displayed on a computer monitor or printed.
When we want an element in our graphic to be defined in such a way that it doesn't scale up or down, e.g., a stroke around a letter that needs to look identical no matter how we resize the logo it belongs to.
It's in these situations, more specifically #1, that the px equivalence comes in handy. When we need to render the graphic or make calculations that involve actual units, unitless dimensions are interpreted as pixels. We can think of it as a default, because we can render the canvas at any size and, in any case, pixels are no longer physical pixels in these days of high-resolution displays and built-in zoom.
For all these reasons, it's probably better to just omit units in your SVG code. Adding them on a general basis only makes the code unnecessarily verbose.
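If you are curious what an absolute unit would turn into, browsers map SVG lengths through CSS, where 1in = 96px = 25.4mm (that 96 px-per-inch reference is an assumption of the rendering environment, not something fixed by SVG 1.1 itself); a quick calculation shows that mm values are not interchangeable with unitless ones:

MM_PER_INCH = 25.4
PX_PER_INCH = 96.0                 # CSS reference pixel; assumed here

def mm_to_user_units(mm):
    """Convert a millimetre length to user units (px) under the 96 dpi CSS mapping."""
    return mm * PX_PER_INCH / MM_PER_INCH

print(mm_to_user_units(800))       # ~3023.6 user units, not 800
print(mm_to_user_units(600))       # ~2267.7 user units

So width="800mm" describes a much larger viewport than width="800"; if the goal is just to avoid ambiguity, omitting the unit (or writing px) is the safer choice.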

How to expand a skeleton line back to a normal vessel with ITK when I have the radius for every pixel?

I did a thinning operation on vessels, and now I'm trying to reconstruct them.
How can I expand them back to normal vessels in ITK, given that I have a skeleton line and a radius value for each pixel?
DISCLAIMER: This could be slow, but since no other answer has been suggested, here you go.
Since your question does not indicate this, I'm assuming that you're talking about a 2D image, but the following approach can be extended to 3D too. This is how I'd go about it:
1. Create a blank image with zero-filled pixel values.
2. Create multiple instances of a disk/sphere ShapedNeighborhoodIterator on the blank image, each with a different radius (choose the most common radii from the vessel width histogram).
3. Visit each pixel in the binary skeleton image. When you come upon a white (vessel skeleton) pixel, look up the vessel radius at that pixel.
4. If you already have a ShapedNeighborhoodIterator for that radius value, move the iterator to that pixel location in the blank image and fill a disk/sphere of white pixels centered on that pixel. If you don't have one for that radius, create it and do the same.
Once you finish iterating over the skeletonized image, you will have a reconstructed tree in the other image. Note that step 2 is optional, but will help you achieve faster computation.
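A compact NumPy sketch of the same idea in 2D (painting a disc per skeleton pixel; this does not use the ITK ShapedNeighborhoodIterator API, and the array names here are assumptions):

import numpy as np

def reconstruct_vessels(skeleton, radius_map):
    """skeleton: 2D bool array; radius_map: 2D array of radii in pixels. Returns a 2D bool mask."""
    out = np.zeros_like(skeleton, dtype=bool)
    rows, cols = np.nonzero(skeleton)
    for r, c in zip(rows, cols):
        rad = int(np.ceil(radius_map[r, c]))
        r0, r1 = max(r - rad, 0), min(r + rad + 1, skeleton.shape[0])
        c0, c1 = max(c - rad, 0), min(c + rad + 1, skeleton.shape[1])
        yy, xx = np.ogrid[r0:r1, c0:c1]
        # paint a filled disc of the stored radius around the skeleton pixel
        disc = (yy - r) ** 2 + (xx - c) ** 2 <= radius_map[r, c] ** 2
        out[r0:r1, c0:c1] |= disc
    return out

The 3D version is the same loop with a third axis and spheres instead of discs.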