Robotics: distance in configuration space - configuration

In the configuration space, let us say there is a configuration c_1 and a target configuration c_t, is there a way to determine "how far away" c_1 is from c_t? I assume "how far away" can be measures by number of pose changes. Is there a mathematical way of defining this distance between two configurations?

The distance between configuration vectors q_a and q_b in configuration space (C-space) is the norm of the difference between the two vectors (as an aside, configurations are normally denoted q). This is true for any n-dimensional space. The equation for this is:
To explain a bit more - as I'm sure you know, the distance between point a and b in 2D Euclidean space is given by the square root of the summation of the x and y distance squared:
d_x = a_x - b_x
d_y = a_y - b_y
distance = sqrt(d_x^2 + d_y^2)
To generalise this to n dimensions: the distance between two points a and b in an n-dimensional space is given by:

Related

Questions about convolution (in CNN)

I suddenly came up with a question about convolution and just wanted to be clear if I'm missing something. The question is whether if the two operations below are identical.
Case1)
Suppose we have a feature map C^2 x H x W. And, we have a K x K x C^2 Conv weight with stride S. (To be clear, C^2 is the channel dimension but just wanted to make it as a square number, K is the kernel size).
Case2)
Suppose we have a feature map 1 x CH x CW. And, we have a CK x CK x 1 Conv weight with stride CS.
So, basically Case2 is a pixel-upshuffled version of case1 (both feature-map and Conv weight.) As convolutions are simply element-wise multiplication, both operations seem identical to me.
# given a feature map and a conv_weight, namely f_map, conv_weight
#case1)
convLayer = Conv(conv_weight)
result = convLayer(f_map, stride=1)
#case2)
f_map = pixelshuffle(f_map, scale=C)
conv_weight = pixelshuffle(f_map, scale=C)
result = convLayer(f_map, stride=C)
But this means that, (for example) given a 256xHxW feature-map with a 3x3 Conv (as in many deep learning models), performing a convolution was simply doing a HUUUGE 48x48 Conv to a 1 x 16*H x 16*W Feature map.
But this doesn't meet my basic intuition of CNNs, stacking multiple of layers with the smallest 3x3 Conv, resulting in somewhat large receptive field, and each channel having different (possibly redundant) information.
You can, in a sense, think of "folding" spatial information into the channel dimension. This is the rationale behind ResNet's trade-off between spatial resolution and feature dimension. In the ResNet case whenever they sample x2 in space they increase feature space x2. However, since you have two spatial dimensions and you sample x2 in both you effectively reduce the "volume" of the feature map by x0.5.

Topojson: quantization VS simplification

What is the difference between quantization and simplification?
Is quantization another way of doing simplification?
Is it better to use quantization in certain situations?
Or should i be using a combination of both?
The total size of your geometry is controlled by two factors: the number of points and the number of digits (the precision) of each coordinate.
Say you have a large geometry with 1,000,000 points, where each two-dimensional point is represented as longitude in ±180° and latitude in ±90°:
[-90.07231180399987,29.501753271000098],[-90.06635619599979,29.499494248000133],…
Real numbers can have arbitrary precision (in JSON; in JavaScript they are limited by the precision of IEEE 754) and thus an infinite number of digits. But in practice the above is pretty typical, so say each coordinate has 18 digits. Including extra symbols ([, ] and ,), each point takes at most 1 + 18 + 1 + 18 + 1 = 39 bytes to encode in JSON, and the entire geometry is about 39 * 1,000,000 ≈ 39MB.
Now say we convert these real numbers to integers: both longitude and latitude are reduced to integers x and y where 0 ≤ x ≤ 99 and 0 ≤ y ≤ 99. A simple mapping between real-number points ⟨λ,φ⟩ and integer coordinates ⟨x,y⟩ is:
x = floor((λ + 180) / 360 * 100);
y = floor((φ + 90) / 180 * 100);
Since each coordinate now takes at most 2 digits to encode, each point takes at most 1 + 2 + 1 + 2 + 1 = 7 bytes to encode in JSON, and the entire geometry is about 7MB; we reduced the total size by 82%.
Of course, nothing comes for free: if you remove too much information, you will no longer be able to display the geometry accurately. The rule of thumb is that the size of your grid should be at least twice as big as the largest expected display size for the entire map. For example, if you’re displaying a world map in a 960×500 pixel space, then the default 10,000×10,000 (-q 1e4) is a reasonable choice.
So, quantization removes information by reducing the precision of each coordinate, effectively snapping each point to a regular grid. This reduces the size of the generated TopoJSON file because each coordinate is represented as an integer (such as between 0 and 9,999) with fewer digits.
In contrast, simplification removes information by removing points, applying a heuristic that tries to measure the visual salience of each point and removing the least-noticeable points. There are many different methods of simplification, but the Visvalingam method used by the TopoJSON reference implementation is described in my Line Simplification article so I won’t repeat myself here.
While quantization and simplification address these two different types of information mostly independently, there’s an additional complication: quantization is applied before the topology is constructed, whereas simplification is necessarily applied after to preserve the topology. Since quantization frequently introduces coincident points ([24,62],[24,62],[24,62]…), and coincident points are removed, quantization can also remove points.
The reason that quantization is applied before the topology is constructed is that geometric inputs are often not topologically valid. For example, if you takes a shapefile of Nevada counties and combine it with a shapefile of Nevada’s state border, the coordinates in one shapefile might not exactly match the coordinates in the other shapefile. By quantizing the coordinates before constructing the topology, you snap the coordinates to a regular grid and can get a cleaner topology with fewer arcs, hopefully correctly identifying all shared arcs. (Of course, if you over-quantize, then you can cause too many coincident points and get self-intersecting arcs, which causes other problems.)
In a future release, maybe 1.5.0, TopoJSON will allow you to control the quantization before the topology is constructed independently from the quantization of the output TopoJSON file. Thus, you could use a finer grid (or no grid at all!) to compute the topology, then simplify, then use a coarser grid appropriate for a low-resolution screen display. For now, these are tied together, so I recommend using a finer grid (e.g., -q 1e6) that produces a clean topology, at the expense of a slightly larger file. Since TopoJSON also uses delta-encoded coordinates, you rarely pay the full price for all the digits anyway!
The two are related, but have different purposes and results.
I believe quantization collapses nearby points based on the parameter (which you tune to the expected resolution of the view) - no point in having a resolution higher than the pixels that will be drawing the map. But it doesn't go out of the way to analyze the path to determine the optimal number of points needed to represent the shape.
Simplification is an algorithm that will analyze the polygon and reduce the number of points in an optimal manner such that the overall deformation of the polygon is minimized. Basically, it can be used to dramamatically reduce the number of points (and thus file size) without noticeable impact to the quality of the path.
As a parallel case study, consider a straight line made up of 10 points. Quantization will reduce the number of points (collapsing nearby or coincident points) based on the value you use. Simplification will analyze the line and realize that 8 out of the ten points can be removed without significantly changing the polygon's overall shape, and reduce the line to two points (because there is no deformation of the path by removing points on a line).
See also:
Topojson reference: https://github.com/mbostock/topojson/wiki/Command-Line-Reference
M. Bostock's Simplification article: http://bost.ocks.org/mike/simplify/
Both should be used in combination: quatization to reduce the map to a right sized grid, simplification to optimize the paths.

any rules of thumb how to smooth FFT spectrum to prevent artifacts when hand-tweaking?

I've got a FFT magnitude spectrum and I want to create a filter from it that selectively passes periodic noise sources (e.g. sinewave spurs) and zero's out the frequency bins associated with the random background noise. I understand sharp transitions in the freq domain will create ringing artifacts once this filter is IFFT back to the time domain... and so I'm wondering if there are any rules of thumb how to smooth the transitions in such a filter to avoid such ringing.
For example, if the FFT has 1M frequency bins, and there are five spurs poking out of the background noise floor, I'd like to zero all bins except the peak bin associated with each of the five spurs. The question is how to handle the neighboring spur bins to prevent artifacts in the time domain. For example, should the the bin on each side of a spur bin be set to 50% amplitude? Should two bins on either side of a spur bin be used (the closest one at 50%, and the next closest at 25%, etc.)? Any thoughts greatly appreciated. Thanks!
I like the following method:
Create the ideal magnitude spectrum (remembering to make it symmetrical about DC)
Inverse transform to the time domain
Rotate the block by half the blocksize
Apply a Hann window
I find it creates reasonably smooth frequency domain results, although I've never tried it on something as sharp as you're suggesting. You can probably make a sharper filter by using a Kaiser-Bessel window, but you have to pick the parameters appropriately. By sharper, I'm guessing maybe you can reduce the sidelobes by 6 dB or so.
Here's some sample Matlab/Octave code. To test the results, I used freqz(h, 1, length(h)*10);.
function [ht, htrot, htwin] = ArbBandPass(N, freqs)
%# N = desired filter length
%# freqs = array of frequencies, normalized by pi, to turn into passbands
%# returns raw, rotated, and rotated+windowed coeffs in time domain
if any(freqs >= 1) || any(freqs <= 0)
error('0 < passband frequency < 1.0 required to fit within (DC,pi)')
end
hf = zeros(N,1); %# magnitude spectrum from DC to 2*pi is intialized to 0
%# In Matlabs FFT, idx 1 -> DC, idx 2 -> bin 1, idx N/2 -> Fs/2 - 1, idx N/2 + 1 -> Fs/2, idx N -> bin -1
idxs = round(freqs * N/2)+1; %# indeces of passband freqs between DC and pi
hf(idxs) = 1; %# set desired positive frequencies to 1
hf(N - (idxs-2)) = 1; %# make sure 2-sided spectrum is symmetric, guarantees real filter coeffs in time domain
ht = ifft(hf); %# this will have a small imaginary part due to numerical error
if any(abs(imag(ht)) > 2*eps(max(abs(real(ht)))))
warning('Imaginary part of time domain signal surprisingly large - is the spectrum symmetric?')
end
ht = real(ht); %# discard tiny imag part from numerical error
htrot = [ht((N/2 + 1):end) ; ht(1:(N/2))]; %# circularly rotate time domain block by N/2 points
win = hann(N, 'periodic'); %# might want to use a window with a flatter mainlobe
htwin = htrot .* win;
htwin = htwin .* (N/sum(win)); %# normalize peak amplitude by compensating for width of window lineshape

Good way to procedurally generate a "blob" graphic in 2D

I'm looking to create a "blob" in a computationally fast manner. A blob here is defined as a collection of pixels that could be any shape, but all connected. Examples:
.ooo....
..oooo..
....oo..
.oooooo.
..o..o..
...ooooooooooooooooooo...
..........oooo.......oo..
.....ooooooo..........o..
.....oo..................
......ooooooo....
...ooooooooooo...
..oooooooooooooo.
..ooooooooooooooo
..oooooooooooo...
...ooooooo.......
....oooooooo.....
.....ooooo.......
.......oo........
Where . is dead space and o is a marked pixel. I only care about "binary" generation - a pixel is either ON or OFF. So for instance these would look like some imaginary blob of ketchup or fictional bacterium or whatever organic substance.
What kind of algorithm could achieve this? I'm really at a loss
David Thonley's comment is right on, but I'm going to assume you want a blob with an 'organic' shape and smooth edges. For that you can use metaballs. Metaballs is a power function that works on a scalar field. Scalar fields can be rendered efficiently with the marching cubes algorithm. Different shapes can be made by changing the number of balls, their positions and their radius.
See here for an introduction to 2D metaballs: https://web.archive.org/web/20161018194403/https://www.niksula.hut.fi/~hkankaan/Homepages/metaballs.html
And here for an introduction to the marching cubes algorithm: https://web.archive.org/web/20120329000652/http://local.wasp.uwa.edu.au/~pbourke/geometry/polygonise/
Note that the 256 combinations for the intersections in 3D is only 16 combinations in 2D. It's very easy to implement.
EDIT:
I hacked together a quick example with a GLSL shader. Here is the result by using 50 blobs, with the energy function from hkankaan's homepage.
Here is the actual GLSL code, though I evaluate this per-fragment. I'm not using the marching cubes algorithm. You need to render a full-screen quad for it to work (two triangles). The vec3 uniform array is simply the 2D positions and radiuses of the individual blobs passed with glUniform3fv.
/* Trivial bare-bone vertex shader */
#version 150
in vec2 vertex;
void main()
{
gl_Position = vec4(vertex.x, vertex.y, 0.0, 1.0);
}
/* Fragment shader */
#version 150
#define NUM_BALLS 50
out vec4 color_out;
uniform vec3 balls[NUM_BALLS]; //.xy is position .z is radius
bool energyField(in vec2 p, in float gooeyness, in float iso)
{
float en = 0.0;
bool result = false;
for(int i=0; i<NUM_BALLS; ++i)
{
float radius = balls[i].z;
float denom = max(0.0001, pow(length(vec2(balls[i].xy - p)), gooeyness));
en += (radius / denom);
}
if(en > iso)
result = true;
return result;
}
void main()
{
bool outside;
/* gl_FragCoord.xy is in screen space / fragment coordinates */
outside = energyField(gl_FragCoord.xy, 1.0, 40.0);
if(outside == true)
color_out = vec4(1.0, 0.0, 0.0, 1.0);
else
discard;
}
Here's an approach where we first generate a piecewise-affine potato, and then smooth it by interpolating. The interpolation idea is based on taking the DFT, then leaving the low frequencies as they are, padding with zeros at high frequencies, and taking an inverse DFT.
Here's code requiring only standard Python libraries:
import cmath
from math import atan2
from random import random
def convexHull(pts): #Graham's scan.
xleftmost, yleftmost = min(pts)
by_theta = [(atan2(x-xleftmost, y-yleftmost), x, y) for x, y in pts]
by_theta.sort()
as_complex = [complex(x, y) for _, x, y in by_theta]
chull = as_complex[:2]
for pt in as_complex[2:]:
#Perp product.
while ((pt - chull[-1]).conjugate() * (chull[-1] - chull[-2])).imag < 0:
chull.pop()
chull.append(pt)
return [(pt.real, pt.imag) for pt in chull]
def dft(xs):
pi = 3.14
return [sum(x * cmath.exp(2j*pi*i*k/len(xs))
for i, x in enumerate(xs))
for k in range(len(xs))]
def interpolateSmoothly(xs, N):
"""For each point, add N points."""
fs = dft(xs)
half = (len(xs) + 1) // 2
fs2 = fs[:half] + [0]*(len(fs)*N) + fs[half:]
return [x.real / len(xs) for x in dft(fs2)[::-1]]
pts = convexHull([(random(), random()) for _ in range(10)])
xs, ys = [interpolateSmoothly(zs, 100) for zs in zip(*pts)] #Unzip.
This generates something like this (the initial points, and the interpolation):
Here's another attempt:
pts = [(random() + 0.8) * cmath.exp(2j*pi*i/7) for i in range(7)]
pts = convexHull([(pt.real, pt.imag ) for pt in pts])
xs, ys = [interpolateSmoothly(zs, 30) for zs in zip(*pts)]
These have kinks and concavities occasionally. Such is the nature of this family of blobs.
Note that SciPy has convex hull and FFT, so the above functions could be substituted by them.
You could probably design algorithms to do this that are minor variants of a range of random maze generating algorithms. I'll suggest one based on the union-find method.
The basic idea in union-find is, given a set of items that is partitioned into disjoint (non-overlapping) subsets, to identify quickly which partition a particular item belongs to. The "union" is combining two disjoint sets together to form a larger set, the "find" is determining which partition a particular member belongs to. The idea is that each partition of the set can be identified by a particular member of the set, so you can form tree structures where pointers point from member to member towards the root. You can union two partitions (given an arbitrary member for each) by first finding the root for each partition, then modifying the (previously null) pointer for one root to point to the other.
You can formulate your problem as a disjoint union problem. Initially, every individual cell is a partition of its own. What you want is to merge partitions until you get a small number of partitions (not necessarily two) of connected cells. Then, you simply choose one (possibly the largest) of the partitions and draw it.
For each cell, you will need a pointer (initially null) for the unioning. You will probably need a bit vector to act as a set of neighbouring cells. Initially, each cell will have a set of its four (or eight) adjacent cells.
For each iteration, you choose a cell at random, then follow a pointer chain to find its root. In the details from the root, you find its neighbours set. Choose a random member from that, then find the root for that, to identify a neighbouring region. Perform the union (point one root to the other, etc) to merge the two regions. Repeat until you're happy with one of the regions.
When merging partitions, the new neighbour set for the new root will be the set symmetric difference (exclusive or) of the neighbour sets for the two previous roots.
You'll probably want to maintain other data as you grow your partitions - e.g. the size - in each root element. You can use this to be a bit more selective about going ahead with a particular union, and to help decide when to stop. Some measure of the scattering of the cells in a partition may be relevant - e.g. a small deviance or standard deviation (relative to a large cell count) probably indicates a dense roughly-circular blob.
When you finish, you just scan all cells to test whether each is a part of your chosen partition to build a separate bitmap.
In this approach, when you randomly choose a cell at the start of an iteration, there's a strong bias towards choosing the larger partitions. When you choose a neighbour, there's also a bias towards choosing a larger neighbouring partition. This means you tend to get one clearly dominant blob quite quickly.

Normal vector from least squares-derived plane

I have a set of points and I can derive a least squares solution in the form:
z = Ax + By + C
The coefficients I compute are correct, but how would I get the vector normal to the plane in an equation of this form? Simply using A, B and C coefficients from this equation don't seem correct as a normal vector using my test dataset.
Following on from dmckee's answer:
a x b = (a2b3 − a3b2), (a3b1 − a1b3), (a1b2 − a2b1)
In your case a1=1, a2=0 a3=A b1=0 b2=1 b3=B
so = (-A), (-B), (1)
Form the two vectors
v1 = <1 0 A>
v2 = <0 1 B>
both of which lie in the plane and take the cross-product:
N = v1 x v2 = <-A, -B, +1> (or v2 x v1 = <A, B, -1> )
It works because the cross-product of two vectors is always perpendicular to both of the inputs. So using two (non-colinear) vectors in the plane gives you a normal.
NB: You probably want a normalized normal, of course, but I'll leave that as an exercise.
A little extra color on the dmckee answer. I'd comment directly, but I do not have enough SO rep yet. ;-(
The plane z = Ax + By + C only contains the points (1, 0, A) and (0, 1, B) when C=0. So, we would be talking about the plane z = Ax + By. Which is fine, of course, since this second plane is parallel to the original one, the unique vertical translation that contains the origin. The orthogonal vector we wish to compute is invariant under translations like this, so no harm done.
Granted, dmckee's phrasing is that his specified "vectors" lie in the plane, not the points, so he's arguably covered. But it strikes me as helpful to explicitly acknowledge the implied translations.
Boy, it's been a while for me on this stuff, too.
Pedantically yours... ;-)