What is an AABB? A kind of preview collision check before the real (I mean, the accurate) collision detection?
I displayed the debug shapes and set the b2DebugDraw.e_aabbBit flag in order to see it in action. I set up a simple falling box, and when the box hits the ground, the AABB frame looks completely different.
I think the other respondent already addressed what an AABB is. This response adds to their answer by explaining why the AABB you've drawn could be bigger and of different shape.
The AABB that's being displayed here looks like an AABB calculated for a square that's been dropped onto the green surface. That would at least explain why the AABB is bigger and why the AABB isn't the same shape.
Here's why...
While an AABB can serve as a minimally enclosing axis-aligned bounding box for any shape, Box2D also uses AABBs that encompass some movement and some additional wiggle room for future movement.
Take void b2Fixture::CreateProxies(b2BroadPhase* broadPhase, const b2Transform& xf) for example. Here Box2D calculates AABBs using the shape's ComputeAABB method which basically just calculates the minimal enclosure. This method is called for every "child" shape when first creating a new fixture (where any shape may be composed of 1 or more sub-shapes that are called "child" shapes).
OTOH, take a look at void b2Fixture::Synchronize(b2BroadPhase* broadPhase, const b2Transform& transform1, const b2Transform& transform2). This method is invoked from calling the world step method which calls it for all bodies that may have moved. When it calculates AABBs, it:
1. computes an AABB for where every child shape has been,
2. computes an AABB for where every child shape has gone to (in that step),
3. combines the two AABBs for every child shape into the "proxy" AABB (see void b2AABB::Combine(const b2AABB& aabb1, const b2AABB& aabb2)), and finally
4. calls broadPhase->MoveProxy(proxy->proxyId, proxy->aabb, displacement) which adds some predictive room to each AABB.
So as to why specifically the AABB drawn could be bigger and of different shape: step 3 often results in an AABB enlarged to hold most of the entire sweep of the child shape, and step 4 results in an AABB that's further extended on all sides by b2_aabbExtension and additionally extended in the direction of travel (by b2_aabbMultiplier * displacement).
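To make steps 3 and 4 concrete, here is a minimal Python sketch of the combine-then-fatten idea. It is not Box2D's code: the AABB class, the function names, and the default values for extension and multiplier are assumptions chosen for illustration, standing in for b2_aabbExtension and b2_aabbMultiplier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AABB:
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def combine(a: AABB, b: AABB) -> AABB:
    """Smallest AABB that contains both a and b (step 3)."""
    return AABB(min(a.min_x, b.min_x), min(a.min_y, b.min_y),
                max(a.max_x, b.max_x), max(a.max_y, b.max_y))


def fatten(box: AABB, displacement, extension=0.1, multiplier=2.0) -> AABB:
    """Pad on all sides, then stretch in the direction of travel (step 4).

    extension and multiplier are assumed values for this sketch.
    """
    min_x, min_y = box.min_x - extension, box.min_y - extension
    max_x, max_y = box.max_x + extension, box.max_y + extension
    # Predicted extra motion, applied only on the side the shape moves toward.
    px, py = multiplier * displacement[0], multiplier * displacement[1]
    if px < 0.0:
        min_x += px
    else:
        max_x += px
    if py < 0.0:
        min_y += py
    else:
        max_y += py
    return AABB(min_x, min_y, max_x, max_y)


# A unit square that fell one unit during the step.
before = AABB(0.0, 1.0, 1.0, 2.0)
after = AABB(0.0, 0.0, 1.0, 1.0)
swept = combine(before, after)
proxy = fatten(swept, displacement=(0.0, -1.0))
```

With these numbers the swept box is already twice as tall as the square, and the fattened proxy extends well below it, which is the kind of oversized, non-square frame the debug draw shows.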
Hope this helps!
Well, an axis-aligned bounding box is a... simple box :) You didn't give me any details about your project, though. Anyway, the bounding box is just a mathematical box that can enclose any physical body in three dimensions.
Just try to wrap your body with many boxes and you'll find that it's sufficient to cover all the physics. It is used heavily in games because of its simplicity. You can do the same thing with sphere models, but more complicated shapes need more complicated math, and that adds overhead which is too much for current CPUs (and GPUs).
But overall there is not much to explain, provided you don't need the maths. It's a simple box that is tested for collision against another AABB, with the help of triangle "normals" and "matrix" transformations.
I have gone through a couple of YOLO tutorials but I am finding it somewhat hard to figure out whether the anchor boxes for each cell the image is divided into are predetermined. In one of the guides I went through, the image was divided into 13x13 cells and it stated that each cell predicts 5 anchor boxes (bigger than it; here's my first problem, because it also says it would first detect what object is present in the small cell before predicting the boxes).
How can the small cell predict anchor boxes for an object bigger than it? Also, it's said that each cell classifies before predicting its anchor boxes. How can the small cell classify the right object in it without querying neighbouring cells if only a small part of the object falls within the cell?
E.g. say one of the 13 cells contains only the white pocket part of a man wearing a T-shirt. How can that cell classify correctly that a man is present without being linked to its neighbouring cells? With a normal CNN, when trying to localize a single object, I know the bounding box prediction relates to the whole image, so at least I can say the network has an idea of what's going on everywhere in the image before deciding where the box should be.
PS: What I currently think of how the YOLO works is basically each cell is assigned predetermined anchor boxes with a classifier at each end before the boxes with the highest scores for each class is then selected but I am sure it doesn't add up somewhere.
UPDATE: Made a mistake with this question, it should have been about how regular bounding boxes were decided rather than anchor/prior boxes. So I am marking @craq's answer as correct because that's how anchor boxes are decided according to the YOLO v2 paper.
I think there are two questions here. Firstly, the one in the title, asking where the anchors come from. Secondly, how anchors are assigned to objects. I'll try to answer both.
Anchors are determined by a k-means procedure, looking at all the bounding boxes in your dataset. If you're looking at vehicles, the ones you see from the side will have an aspect ratio of about 2:1 (width = 2*height). The ones viewed from in front will be roughly square, 1:1. If your dataset includes people, the aspect ratio might be 1:3. Foreground objects will be large, background objects will be small. The k-means routine will figure out a selection of anchors that represent your dataset. k=5 for YOLOv2, but there are different numbers of anchors for each YOLO version (YOLOv3 uses 9).
It's useful to have anchors that represent your dataset, because YOLO learns how to make small adjustments to the anchor boxes in order to create an accurate bounding box for your object. YOLO can learn small adjustments better/easier than large ones.
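As a rough illustration of the k-means step, here is a small pure-Python sketch that clusters (width, height) pairs using IoU as the similarity measure, as the YOLO v2 paper suggests. The function names, the deterministic initialisation, and the toy dataset are my own assumptions, not code from any YOLO implementation.

```python
def iou_wh(a, b):
    """IoU of two boxes given as (width, height), aligned at one corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeans_anchors(boxes, k, iterations=50):
    """Cluster (width, height) pairs; the cluster means become the anchors."""
    # Deterministic start (an assumption): spread initial centroids over
    # the boxes sorted by area instead of picking them at random.
    ordered = sorted(boxes, key=lambda wh: wh[0] * wh[1])
    step = len(ordered) / k
    centroids = [ordered[int(i * step + step / 2)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Highest IoU is the same as smallest (1 - IoU) distance.
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        updated = []
        for old, members in zip(centroids, clusters):
            if members:
                updated.append((sum(w for w, _ in members) / len(members),
                                sum(h for _, h in members) / len(members)))
            else:
                updated.append(old)  # keep an empty cluster where it was
        if updated == centroids:
            break
        centroids = updated
    return centroids


# Toy data: side-on vehicles (2:1), head-on vehicles (1:1), people (1:3).
boxes = [(2.0, 1.0), (2.2, 1.1), (1.0, 1.0), (1.1, 0.9), (1.0, 3.0), (0.9, 3.1)]
anchors = kmeans_anchors(boxes, k=3)
```

On this toy data the three anchors settle on one wide, one square-ish and one tall shape, which is the behaviour described above.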
The assignment problem is trickier. As I understand it, part of the training process is for YOLO to learn which anchors to use for which object. So the "assignment" isn't deterministic like it might be for the Hungarian algorithm. Because of this, in general, multiple anchors will detect each object, and you need to do non-max-suppression afterwards in order to pick the "best" one (i.e. highest confidence).
There are a couple of points that I needed to understand before I came to grips with anchors:
Anchors can be any size, so they can extend beyond the boundaries of the 13x13 grid cells. They have to be, in order to detect large objects.
Anchors only enter in the final layers of YOLO. YOLO's neural network makes 13x13x5=845 predictions (assuming a 13x13 grid and 5 anchors). The predictions are interpreted as offsets to anchors from which to calculate a bounding box. (The predictions also include a confidence/objectness score and a class label.)
YOLO's loss function compares each object in the ground truth with one anchor. It picks the anchor (before any offsets) with highest IoU compared to the ground truth. Then the predictions are added as offsets to the anchor. All other anchors are designated as background.
If anchors which have been assigned to objects have high IoU, their loss is small. Anchors which have not been assigned to objects should predict background by setting confidence close to zero. The final loss function is a combination from all anchors. Since YOLO tries to minimise its overall loss function, the anchor closest to ground truth gets trained to recognise the object, and the other anchors get trained to ignore it.
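The points above can be sketched in a few lines of Python. This is only an illustration under assumptions: anchors and boxes are (width, height) pairs in grid-cell units, the helper names are made up, and the decoding follows the formulas in the YOLO v2 paper (sigmoid offsets for the centre, exponential scaling for the size).

```python
import math


def iou_wh(a, b):
    """IoU of two (width, height) boxes that share the same centre."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def best_anchor(gt_wh, anchors):
    """Index of the anchor (before any offsets) with the highest IoU."""
    return max(range(len(anchors)), key=lambda i: iou_wh(gt_wh, anchors[i]))


def decode(prediction, cell, anchor):
    """Turn raw outputs (tx, ty, tw, th) into a box in grid-cell units."""
    tx, ty, tw, th = prediction
    cell_x, cell_y = cell
    anchor_w, anchor_h = anchor

    def sigmoid(v):
        return 1.0 / (1.0 + math.exp(-v))

    # The centre stays inside its cell; the size may be far larger than a cell.
    return (cell_x + sigmoid(tx), cell_y + sigmoid(ty),
            anchor_w * math.exp(tw), anchor_h * math.exp(th))


anchors = [(1.0, 1.0), (2.0, 1.0), (1.0, 3.0)]
assigned = best_anchor((1.8, 1.1), anchors)   # the wide anchor wins
box = decode((0.0, 0.0, 0.0, 0.0), cell=(6, 6), anchor=anchors[assigned])
```

All-zero predictions reproduce the anchor itself, centred in its cell, which is why anchors that already resemble the dataset leave the network with only small adjustments to learn.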
The following pages helped my understanding of YOLO's anchors:
https://medium.com/@vivek.yadav/part-1-generating-anchor-boxes-for-yolo-like-network-for-vehicle-detection-using-kitti-dataset-b2fe033e5807
https://github.com/pjreddie/darknet/issues/568
I think that your statement about the number of predictions of the network could be misleading. Assuming a 13 x 13 grid and 5 anchor boxes the output of the network has, as I understand it, the following shape: 13 x 13 x 5 x (2+2+1+nbOfClasses)
13 x 13: the grid
x 5: the anchors
x (2+2+1+nbOfClasses): (x, y)-coordinates of the center of the bounding box (in the coordinate system of each cell), (h, w)-deviation of the bounding box (deviation to the prior anchor boxes), one confidence/objectness score, and a softmax activated class vector indicating a probability for each class.
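A quick sanity check of the numbers, as a hedged sketch: it assumes the YOLO v2 layout in which every anchor box carries four box values, the confidence/objectness score mentioned earlier in this thread, and the class scores. The function name and the 20-class default are assumptions for illustration.

```python
def yolo_output_shape(grid=13, anchors=5, num_classes=20):
    """Shape of the prediction tensor under the assumed layout."""
    per_box = 2 + 2 + 1 + num_classes  # x, y + w, h + objectness + classes
    return (grid, grid, anchors, per_box)


shape = yolo_output_shape()
num_boxes = shape[0] * shape[1] * shape[2]   # one box per cell per anchor
num_values = num_boxes * shape[3]            # total numbers the network emits
```

So "845 predictions" counts boxes, while the number of raw output values is that count times the per-box vector length.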
If you want to have more information about the determination of the anchor priors you can take a look at the original paper in the arxiv: https://arxiv.org/pdf/1612.08242.pdf.
So I am making a physics engine that only uses rectangles (axis-aligned bounding boxes) as shapes. I have implemented a method from Christer Ericson's book that returns the collision time and normal of two moving AABBs. I have also made another method that takes the velocities and positions of two AABBs plus a normal, responds to the collision, and gives the AABBs new velocities.
The actual problem now is that I don't know what the loop that checks the collisions between all AABBs and responds to them should look like. Simply put, I don't understand how to order the collisions by time of impact, or which collision I should respond to.
A loop written in pseudocode that shows how to order all collisions would be really helpful.
Another thing I've noticed is that a moving box could bounce between two static boxes hundreds of times in a single frame if its velocity is really high. How do you handle that?
You should let a simple priority queue take care of the right order of execution. Conceptually, this would end up looking something like this:
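queue<CollisionEvent> q = new empty priority queue (ordered by event.time)
// seed the queue with the first round of collisions, otherwise the loop never runs
for each event in computeCollisionEvents(allParticles, allParticles) {
    q.enqueue(event, event.time);
}
while (!q.isEmpty) {
    nextCollision = q.dequeueMinimum
    /* run animation until nextCollision.time
       ...
    */
    newMovingParticles = nextCollision.movingParticles
    newCollisions = computeCollisionEvents(newMovingParticles, allOtherParticles);
    for each event in newCollisions {
        q.enqueue(event, event.time);
    }
}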
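For something runnable, here is a minimal Python sketch of the same event loop, reduced to one dimension to keep it short. Everything in it is an assumption for illustration: the Box class, the equal-mass elastic response (the two boxes swap velocities), the hits counter used to discard outdated events, and the max_events cap, which is one blunt way to bound the work per frame when a fast box rattles between neighbours.

```python
import heapq


class Box:
    def __init__(self, x, width, v):
        self.x, self.width, self.v = x, width, v
        self.hits = 0  # bumped on every collision to invalidate old events


def time_of_impact(a, b):
    """Time until two 1-D boxes touch, or None if they are not closing."""
    left, right = (a, b) if a.x <= b.x else (b, a)
    closing = left.v - right.v
    if closing <= 0.0:
        return None
    gap = right.x - (left.x + left.width)
    return max(gap, 0.0) / closing


def step(boxes, dt, max_events=1000):
    """Advance all boxes by dt, resolving collisions in time order."""
    now, queue, handled = 0.0, [], 0

    def schedule(i, j):
        t = time_of_impact(boxes[i], boxes[j])
        if t is not None and now + t <= dt:
            heapq.heappush(queue, (now + t, i, j, boxes[i].hits, boxes[j].hits))

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            schedule(i, j)

    while queue and handled < max_events:
        t, i, j, hits_i, hits_j = heapq.heappop(queue)
        if boxes[i].hits != hits_i or boxes[j].hits != hits_j:
            continue  # one of the boxes collided since this was predicted
        for b in boxes:  # move everything up to the moment of impact
            b.x += b.v * (t - now)
        now = t
        boxes[i].v, boxes[j].v = boxes[j].v, boxes[i].v  # equal-mass bounce
        boxes[i].hits += 1
        boxes[j].hits += 1
        handled += 1
        for a in (i, j):  # only the two changed boxes need new predictions
            for k in range(len(boxes)):
                if k != a:
                    schedule(a, k)

    for b in boxes:  # free flight for whatever time is left in the frame
        b.x += b.v * (dt - now)
    return handled
```

The key points are that the earliest event is always handled first, and that after each response only the boxes whose velocity changed get their future collisions recomputed; predictions made before that are skipped when they surface. If the cap is hit, the remaining time is integrated without further collision handling, which is a trade-off rather than a correct answer.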
What about boxes that move with large velocities: I don't know. Physically, it would make sense to just accept that it can happen that there is a sequence of very frequent collision events. I can not explain why, but for some reason I do not expect any infinite loops or zeno-type-problems. I would rather expect that even frontal collisions of very heavy boxes with very light boxes, where the light box is trapped between the heavy box and the wall, end in finitely many steps. This is what makes the rigid bodies rigid, I think one should just accept this as a feature.
I've been able to apply a smooth animation to my sprite and control it using the accelerometer. My sprite is fixed to move left and right along the x-axis.
From here, I need to figure out how to create a vertical infinite wavy line for the sprite to attempt to trace. The aim of my game is for the user to control the sprite's left/right movement with the accelerometer in an attempt to trace the never-ending wavy line as best they can, whilst the sprite and camera both move in a vertical direction to simulate "moving along the line." It would be ideal if the line was randomly generated.
I've researched about splines, planes, bezier curves etc, but I can't find anything that seems to relate close enough to what I'm trying to achieve.
I'm just seeking some guidance as to what methods I could possibly use to achieve this. Any ideas?
You could use sum of 4 to 5 sine waves (each with different amplitude, wavelength and phase difference). All 3 of those parameters could be random.
The resulting curve would be very smooth (since it is primarily sinusoidal) yet it'll look random (its period would be the LCM of all 4 to 5 random wavelengths, which is a huge number).
So the curve won't repeat for a long time, yet it will not be hard on memory. Concerning computational complexity, you can always tune it by changing number of sine terms with FPS.
It should look like this.
It's really easy to implement too. (even I could generate above image.. haha)
Hope this helps. Maths rocks. :D
(The basic idea here is a finite Fourier series which I think should be ideal for your use case)
Edit:
You can create each term like this and assign random values to all terms.
public class SineTerm {
    private float amplitude;
    private float waveLength;
    private float phaseDifference;

    public SineTerm(float amplitude, float waveLength, float phaseDifference) {
        this.amplitude = amplitude;
        this.waveLength = waveLength;
        this.phaseDifference = phaseDifference;
    }

    public float evaluate(float x) {
        return amplitude * (float) Math.sin(2 * Math.PI * x / waveLength + phaseDifference);
    }
}
Now create an array of SineTerms and add all values returned by evaluate(x) (use one coordinate of sprite as input). Use the output as other coordinate of sprite. You should be good to go.
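Here is the summing step as a compact Python sketch, in case it helps to see the whole thing in one place. The function name and the random ranges for amplitude and wavelength are placeholders I picked; as noted, tuning them is the real work.

```python
import math
import random


def make_wavy_line(num_terms=5, seed=None):
    """Return (offset, terms); offset(y) gives the line's x offset at height y."""
    rng = random.Random(seed)
    # Each term is (amplitude, wavelength, phase); the ranges are placeholders.
    terms = [(rng.uniform(10.0, 40.0),
              rng.uniform(200.0, 800.0),
              rng.uniform(0.0, 2.0 * math.pi))
             for _ in range(num_terms)]

    def offset(y):
        return sum(a * math.sin(2.0 * math.pi * y / wl + ph)
                   for a, wl, ph in terms)

    return offset, terms


line_x, terms = make_wavy_line(seed=1)
points = [(line_x(y), y) for y in range(0, 2000, 20)]  # sample for drawing
```

Because offset(y) is a pure function of y, nothing has to be stored: the line can be evaluated only for the part of the track that is currently on screen, and the same seed always reproduces the same line.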
The real trick would be in tuning those random numbers.
Good luck.
I have a single point and a set of shapes. I need to know if the point is contained within the compound shape of those shapes. That is, where all of the shapes intersect.
But that is the easy part.
If the point is outside the compound shape I need to find the position within that compound shape that is closest to the point.
These shapes can be of the type:
square
circle
ring (circle with another circle cut out of the center)
inverse circle (basically just the circular hole and a never-ending fill outside that hole, or to the end of the canvas if there must be a limit to its size)
part of circle (as in a pie chart)
part of ring (as above but
line
The example below has an inverted circle (the biggest circle with grey surrounding it), a ring (topleft) a square and a line.
If we don't consider the line, then the orange part is the shape to constrain to. If the line is taken into account then the saturated orange part of the line is the shape to constrain to.
The black small dots represent the points that need to be constrained. The blue dots represent the desired result. (a 1, b 2 etc.)
Point "f" has no corresponding constrained result, since it is already in the orange area.
For the purpose of this example, only point "e" is constrained to the line, all others are constrained to the orange area.
If none of the shapes intersect, then the point cannot be constrained. If the constraint consisted of two lines that cross each other, then every point would be constrained to the same position (the exact position where the lines cross).
I have found methods that come close to this, but none that I can combine to produce the above functionality.
Some similar questions that I found:
Points within a semi circle
What algorithm can I use to determine points within a semi-circle?
Point closest to MovieClip
Flash: Closest point to MovieClip
Closest point through Minkowski Sum (this will work if I can convert the compound shape to polygons)
http://www.codezealot.org/archives/153
Select edge of polygon closest to point (similar to above)
For a point in an irregular polygon, what is the most efficient way to select the edge closest to the point?
PS: I noticed that the orange area may actually come across as yellow on some screens. It's the colored area in any case.
This isn't much of an answer, but it's a bit too long to fit into a comment ...
It's tempting to think, and therefore to advise you, to find the nearest point in each of the shapes to the point of interest, and to find the nearest of those nearest points.
BUT
The area you are interested in is constructed by union, intersection and difference of other areas and there will, therefore, be no general relationship between the closest points of the original shapes and the closest point of the combined shape. If you understand what I mean. For example, while the closest point of A union B is the closest of the set {closest point of A, closest point of B}, the closest point of A intersection B is not a simple function of that same set; at least not for the general case.
I suggest, therefore, that you are going to have to compute the (complex) shape which represents the area of interest and use one of the algorithms you've already discovered to find the closest point to your point of interest.
I look forward to someone much better versed in computational geometry proving me wrong.
Let's call I the intersection of all the shapes, C the contour of I, p the point you want to constrain and r the result point. We have:
If p is in I, then r = p
If p is not in I, then r is in C. So r is the nearest point in C to p.
So I think what you should do is the following:
1. If p is inside all of the shapes, return p.
2. Compute the contour C of the intersection of all the shapes; it is defined by a list of parts (segments, arcs, ...).
3. Find the nearest point to p on every part of C (computed in step 2) and return the one among them that is nearest to p.
I've discussed this question at length with my brother, and together we came to conclude that any resulting point will always lie on either the point where two shapes intersect, or where a shape intersects with the line from that shape perpendicular to the original point.
In the case of a circular shape constraint, the perpendicular line equals the line to its center. In the case of a line shape constraint, the perpendicular line is (of course) the line perpendicular to itself. In the case of a rectangle, the perpendicular line is the line perpendicular to the closest edge.
(And the same, theoretically, for complex polygon constraints.)
So a new approach (that I'll have to test still) will be to:
calculate all intersecting (with a shape constraint or with the perpendicular line from the original point to the shape constraint) points
keep only those that are valid: that lie within (comply with) all constraints
select the one closest to the original point
If this works, then one more optimization could be to determine first, which intersecting points are nearest and check if they are valid, and then work outward away from the original point until a valid one is found.
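A minimal Python sketch of this candidate-point approach, restricted to the simplest case where every constraint is a plain filled circle, given as (cx, cy, r). The function names and the tolerance eps are assumptions; squares, rings, inverse circles and lines would each need their own projection and intersection routines.

```python
import math


def circle_intersections(c0, c1):
    """Points where two circle outlines cross (empty list if they don't)."""
    (x0, y0, r0), (x1, y1, r1) = c0, c1
    dx, dy = x1 - x0, y1 - y0
    d = math.hypot(dx, dy)
    if d == 0.0 or d > r0 + r1 or d < abs(r0 - r1):
        return []
    a = (r0 * r0 - r1 * r1 + d * d) / (2.0 * d)
    h = math.sqrt(max(r0 * r0 - a * a, 0.0))
    mx, my = x0 + a * dx / d, y0 + a * dy / d
    return [(mx + h * dy / d, my - h * dx / d),
            (mx - h * dy / d, my + h * dx / d)]


def closest_point_in_discs(p, discs, eps=1e-9):
    """Nearest point to p that lies inside every disc, or None if none exists."""
    def inside_all(q):
        return all(math.hypot(q[0] - cx, q[1] - cy) <= r + eps
                   for cx, cy, r in discs)

    if inside_all(p):
        return p
    candidates = []
    # Perpendicular projections: for a circle, the line through its centre.
    for cx, cy, r in discs:
        d = math.hypot(p[0] - cx, p[1] - cy)
        if d > 0.0:
            candidates.append((cx + (p[0] - cx) * r / d,
                               cy + (p[1] - cy) * r / d))
    # Points where two constraint outlines intersect.
    for i in range(len(discs)):
        for j in range(i + 1, len(discs)):
            candidates.extend(circle_intersections(discs[i], discs[j]))
    # Keep only candidates that comply with all constraints, then take nearest.
    valid = [q for q in candidates if inside_all(q)]
    if not valid:
        return None
    return min(valid, key=lambda q: math.hypot(q[0] - p[0], q[1] - p[1]))
```

For two unit discs centred at (0, 0) and (1, 0), a point far above the lens is constrained to the upper crossing point of the outlines, while a point to the right lands on the projection (1, 0).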
If this does not work, I will have another look at the polygon clipping method. For that approach I've come across this useful post:
Compute union of two arbitrary shapes
where clipping complex polygons is made much easier through http://code.google.com/p/gpcas/
The method holds true for all the cases (all points and their results) above, and also for a number of other scenarios that we tested (on paper).
I will try a live version tomorrow at work.
I'm drawing rectangles at random positions on the stage, and I don't want them to overlap.
So for each rectangle, I need to find a blank area to place it.
I've thought about trying a random position, verify if it is free with
private function containsRect(r:Rectangle):Boolean {
    var free:Boolean = true;
    // r is free only if it intersects none of the existing children
    for (var i:int = 0; i < numChildren; i++)
        free &&= !getChildAt(i).getBounds(this).intersects(r);
    return free;
}
and in case it returns false, to try with another random position.
The problem is that if there is no free space, I'll be stuck trying random positions forever.
Is there an elegant solution to this?
Let S be the area of the stage. Let A be the area of the smallest rectangle we want to draw. Let N = S/A
One possible deterministic approach:
When you draw a rectangle on an empty stage, this divides the stage into at most 4 regions that can fit your next rectangle. When you draw your next rectangle, one or two regions are split into at most 4 sub-regions (each) that can fit a rectangle, etc. You will never create more than N regions, where S is the area of your stage, and A is the area of your smallest rectangle. Keep a list of regions (unsorted is fine), each represented by its four corner points, and each labeled with its area, and use weighted-by-area reservoir sampling with a reservoir size of 1 to select a region with probability proportional to its area in at most one pass through the list. Then place a rectangle at a random location in that region. (Select a random point from bottom left portion of the region that allows you to draw a rectangle with that point as its bottom left corner without hitting the top or right wall.)
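The region-selection step can be sketched like this in Python. It assumes regions are stored as (x, y, width, height) tuples and uses hypothetical function names; maintaining the region list as rectangles are drawn is not shown.

```python
import random


def pick_region(regions, rng=random):
    """One pass, reservoir of size 1: choose a region with probability
    proportional to its area."""
    chosen, total_area = None, 0.0
    for region in regions:
        area = region[2] * region[3]
        if area <= 0.0:
            continue
        total_area += area
        # Replace the current pick with probability area / total_area so far.
        if rng.random() * total_area < area:
            chosen = region
    return chosen


def place_in_region(region, w, h, rng=random):
    """Random position for a w x h rectangle fully inside the region."""
    x, y, region_w, region_h = region
    if w > region_w or h > region_h:
        return None  # this region cannot fit the rectangle
    return (rng.uniform(x, x + region_w - w),
            rng.uniform(y, y + region_h - h), w, h)


rng = random.Random(0)
regions = [(0.0, 0.0, 1.0, 1.0), (5.0, 0.0, 3.0, 1.0)]
region = pick_region(regions, rng)
rect = place_in_region(region, 0.5, 0.5, rng)
```

The single pass means the list never has to be sorted or summed beforehand, which is why an unsorted list is fine.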
If you are not starting from a blank stage then just build your list of available regions in O(N) (by re-drawing all the existing rectangles on a blank stage in any order, for example) before searching for your first point to draw a new rectangle.
Note: You can change your reservoir size to k to select the next k rectangles all in one step.
Note 2: You could alternatively store available regions in a tree with each edge weight equaling the sum of areas of the regions in the sub-tree over the area of the stage. Then to select a region in O(logN) we recursively select the root with probability area of root region / S, or each subtree with probability edge weight / S. Updating weights when re-balancing the tree will be annoying, though.
Runtime: O(N)
Space: O(N)
One possible randomized approach:
Select a point at random on the stage. If you can draw one or more rectangles that contain the point (not just one that has the point as its bottom left corner), then return a randomly positioned rectangle that contains the point. It is possible to position the rectangle without bias with some subtleties, but I will leave this to you.
At worst there is one space exactly big enough for our rectangle and the rest of the stage is filled. So this approach succeeds with probability > 1/N, or fails with probability < 1-1/N. Repeat N times. We now fail with probability < (1-1/N)^N < 1/e. By fail we mean that there is a space for our rectangle, but we did not find it. By succeed we mean we found a space if one existed. To achieve a reasonable probability of success we repeat either Nlog(N) times for 1/N probability of failure, or N² times for 1/e^N probability of failure.
Summary: Try random points until we find a space, stopping after NlogN (or N²) tries, in which case we can be confident that no space exists.
Runtime: O(NlogN) for high probability of success, O(N²) for very high probability of success
Space: O(1)
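A simplified Python sketch of the capped retry loop. Note the simplification: it samples a random position for the rectangle's corner rather than a random point the rectangle must contain, so the probability bound above should be read as a rule of thumb for choosing the cap, not as a guarantee for this code. Rectangles are (x, y, width, height) tuples and all names are assumptions.

```python
import math
import random


def overlaps(a, b):
    """True if the interiors of two (x, y, w, h) rectangles intersect."""
    return (a[0] < b[0] + b[2] and b[0] < a[0] + a[2] and
            a[1] < b[1] + b[3] and b[1] < a[1] + a[3])


def place_randomly(stage_w, stage_h, w, h, placed, min_area, rng=random):
    """Try random positions, giving up after about N*log(N) attempts."""
    n = max(2, math.ceil(stage_w * stage_h / min_area))  # N = S / A
    attempts = math.ceil(n * math.log(n))
    for _ in range(attempts):
        candidate = (rng.uniform(0.0, stage_w - w),
                     rng.uniform(0.0, stage_h - h), w, h)
        if not any(overlaps(candidate, other) for other in placed):
            return candidate
    return None  # probably no space left; the caller can stop drawing


rng = random.Random(0)
first = place_randomly(10.0, 10.0, 2.0, 2.0, [], min_area=4.0, rng=rng)
full = place_randomly(10.0, 10.0, 2.0, 2.0, [(0.0, 0.0, 10.0, 10.0)],
                      min_area=4.0, rng=rng)
```

Either way, the loop is guaranteed to terminate, which removes the "stuck forever" problem from the question.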
You can simplify things with a transformation. If you're looking for a valid place to put your LxH rectangle, you can instead grow all of the previous rectangles L units to the right, and H units down, and then search for a single point that doesn't intersect any of those. This point will be the lower-right corner of a valid place to put your new rectangle.
Next apply a scan-line sweep algorithm to find areas not covered by any rectangle. If you want a uniform distribution, you should choose a random y-coordinate (assuming you sweep down) weighted by free area distribution. Then choose a random x-coordinate uniformly from the open segments in the scan line you've selected.
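The transformation itself (not the sweep) is easy to check in code. Below is a small Python sketch, assuming rectangles are (x, y, width, height) with y growing downward and hypothetical function names. It tests a single candidate corner against the grown rectangles; the corner must also be at least L from the left edge and H from the top edge for the rectangle to stay on the stage.

```python
def grow(rect, new_w, new_h):
    """Extend an existing rectangle to the right and downward."""
    x, y, w, h = rect
    return (x, y, w + new_w, h + new_h)


def corner_is_free(px, py, grown_rects):
    """True if (px, py) lies strictly inside none of the grown rectangles."""
    return not any(x < px < x + w and y < py < y + h
                   for x, y, w, h in grown_rects)


def overlaps(a, b):
    """Direct interior-overlap test, used here only for comparison."""
    return (a[0] < b[0] + b[2] and b[0] < a[0] + a[2] and
            a[1] < b[1] + b[3] and b[1] < a[1] + a[3])


existing = [(2.0, 2.0, 3.0, 2.0), (6.0, 1.0, 2.0, 4.0)]
L, H = 2.0, 1.0
grown = [grow(r, L, H) for r in existing]

# Candidate lower-right corner (px, py) of the new L x H rectangle.
px, py = 5.5, 2.0
new_rect = (px - L, py - H, L, H)
```

A point test against the grown rectangles gives the same answer as a rectangle-versus-rectangle overlap test against the originals, which is what makes the sweep for free points sufficient.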
I'm not sure how elegant this would be, but you could set up a maximum number of attempts. Maybe 100?
Sure you might still have some space available, but you could trigger the "finish" event anyway. It would be like when tween libraries snap an object to the destination point just because it's "close enough".
HTH
One possible check you could make to determine if there was enough space would be to check how much area the current set of rectangles is taking up. If the amount of area left over is less than the area of the new rectangle then you can immediately give up and bail out. I don't know what information you have available to you, or whether the rectangles are being laid down in a regular pattern, but if so you may be able to vary the check to see if there is obviously not enough space available.
This may not be the most appropriate method for you, but it was the first thing that popped into my head!
Assuming you define the dimensions of the rectangle before trying to draw it, I think something like this might work:
Establish a grid of possible centre points across the stage for the candidate rectangle. So for a 6x4 rectangle your first point would be at (3, 2), then (3 + 6 * x, 2 + 4 * y). If you can draw a rectangle between the four adjacent points then a possible space exists.
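for (x = 0; x < stage.width / rect.width - 1; x++)
    for (y = 0; y < stage.height / rect.height - 1; y++)
        left = rect.width / 2 + x * rect.width
        top = rect.height / 2 + y * rect.height
        if can_draw_rectangle_at([left, top], [left + rect.width, top + rect.height])
            return true;
return false;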
This doesn't tell you where you can draw it (although it should be possible to build a list of the possible drawing areas), just that you can.
I think that the only efficient way to do this with what you have is to maintain a 2D boolean array of open locations. Have the array of sufficient size such that the drawing positions still appear random.
When you draw a new rectangle, zero out the corresponding rectangular piece of the array. Then checking for a free area is constant^H^H^H^H^H^H^H time. Oops, that means a lookup is O(nm) time, where n is the length, m is the width. There must be a range based solution, argh.
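For reference, the boolean-array idea looks roughly like this in Python. The class and method names are assumptions, and positions are snapped to whole grid cells; a finer grid makes placements look more random at the cost of a slower check.

```python
class OccupancyGrid:
    """2D array of open cells; True means the cell is still free."""

    def __init__(self, cols, rows):
        self.cols, self.rows = cols, rows
        self.free = [[True] * cols for _ in range(rows)]

    def is_free(self, col, row, w, h):
        """Check a w x h block of cells; cost is proportional to w * h."""
        if col < 0 or row < 0 or col + w > self.cols or row + h > self.rows:
            return False
        return all(self.free[r][c]
                   for r in range(row, row + h)
                   for c in range(col, col + w))

    def mark(self, col, row, w, h):
        """Zero out the cells covered by a newly drawn rectangle."""
        for r in range(row, row + h):
            for c in range(col, col + w):
                self.free[r][c] = False


grid = OccupancyGrid(cols=20, rows=10)
if grid.is_free(3, 2, 6, 4):
    grid.mark(3, 2, 6, 4)
```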
Edit2: Apparently the answer is here but in my opinion this might be a bit much to implement on Actionscript, especially if you are not keen on the geometry.
Here's the algorithm I'd use
Put down N number of random points, where N is the number of rectangles you want
iteratively increase the dimensions of rectangles created at each point N until they touch another rectangle.
You can constrain the way that the initial points are put down if you want to have a minimum allowable rectangle size.
If you want all the space covered with rectangles, you can then incrementally add random points to the remaining "free" space until there is no area left uncovered.