PCA in 2D calculate center point in original data - center

I'm trying to create a bounding box around a given dataset.
My idea was to use PCA. I read that it won't always find the optimal solution, but that doesn't matter here.
What I've done so far is calculate the covariance matrix and then compute its SVD.
Let's say we have a sample input like
[40, 20], [-40, -20],[40, -20],[-40, 20],[30, 30]
The covariance matrix will become
[1780.0, 180.0] [180.0, 580.0]
With the SVD I get the rotation matrix U:
[0.99, 0.15]
[0.15, -0.99]
and the diagonal matrix D:
[1806.41, 0]
[0, 553.58]
With my eigenvectors I'm able to calculate the slope of the lines representing the box.
I now need to get the center of the PCA in the original space, not in the zero-centered space.
I also need to find the lengths of those two vectors.
Does anyone have an idea how to get them?

Interesting question. Just some thoughts.
Is the centre you are referring to the mean of the data?
Think of it this way: if we project (0,0) back to the original space, we get the mean.
To find the length, assuming you are trying to include every point in the box, you can project every point onto each principal component direction and record the largest and smallest coordinates. The difference will be the length.
By the way, I am under the impression that PCA on the correlation matrix is usually the more appropriate choice, and I think that applies to your question too.
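The projection idea above can be sketched in plain Python. This is only an illustration under stated assumptions: the function name `oriented_bounding_box` is hypothetical, the covariance uses the n - 1 divisor (which reproduces the example matrix in the question), and the principal direction comes from the closed-form angle for a symmetric 2x2 matrix rather than from an SVD routine. Note that the box centre is the midpoint of the extreme projections, which generally differs from the mean.

```python
import math

def oriented_bounding_box(points):
    """Return (corners, center, (length1, length2)) of a PCA-aligned box."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Sample covariance entries (n - 1 divisor, as in the example matrix).
    cxx = sum(x * x for x, _ in centered) / (n - 1)
    cyy = sum(y * y for _, y in centered) / (n - 1)
    cxy = sum(x * y for x, y in centered) / (n - 1)

    # Closed-form principal direction of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    axis1 = (math.cos(theta), math.sin(theta))
    axis2 = (-math.sin(theta), math.cos(theta))

    # Project every centered point onto both axes and record the extremes.
    proj1 = [x * axis1[0] + y * axis1[1] for x, y in centered]
    proj2 = [x * axis2[0] + y * axis2[1] for x, y in centered]
    lo1, hi1, lo2, hi2 = min(proj1), max(proj1), min(proj2), max(proj2)

    def to_original(u, v):
        # Rotate back from PCA coordinates and add the mean again.
        return (mean_x + u * axis1[0] + v * axis2[0],
                mean_y + u * axis1[1] + v * axis2[1])

    corners = [to_original(lo1, lo2), to_original(hi1, lo2),
               to_original(hi1, hi2), to_original(lo1, hi2)]
    center = to_original((lo1 + hi1) / 2.0, (lo2 + hi2) / 2.0)
    return corners, center, (hi1 - lo1, hi2 - lo2)

points = [(40, 20), (-40, -20), (40, -20), (-40, 20), (30, 30)]
corners, center, (length1, length2) = oriented_bounding_box(points)
```

The two returned lengths are the side lengths of the box along the first and second principal directions.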

I found a solution.
The idea was to use the two eigenvectors and calculate the maximum distance of all points to each of them.
The maximum distance will then be half of the rectangle's width and height, as shown in the picture below.
To position the rectangle I calculate the 4 points by
p1.x = max1 * eigenvector1(0) + max2 * eigenvector1(1)
p1.y = max1 * eigenvector2(0) + max2 * eigenvector2(1)
for all points.
Then I just had to translate the vertices and all data points by meanX and meanY, and the rectangle encloses the original dataset.

The problem with the solution above was that using just max was not the best idea, because the box will then only be minimal in one direction along each eigenvector.
By using min and max I'm now able to create minimal enclosing boxes in both directions of the principal components.
To calculate the points I used the code below, where minDist1 and minDist2 are the absolute values of the minimum distances:
p1.setX(minDist2 * U[0][0] + maxDist1 * U[0][1]);
p1.setY(minDist2 * U[1][0] + maxDist1 * U[1][1]);
p2.setX(minDist2 * U[0][0] - minDist1 * U[0][1]);
p2.setY(minDist2 * U[1][0] - minDist1 * U[1][1]);
p3.setX(-(maxDist2 * U[0][0] + minDist1 * U[0][1]));
p3.setY(-(maxDist2 * U[1][0] + minDist1 * U[1][1]));
p4.setX(-(maxDist2 * U[0][0] - maxDist1 * U[0][1]));
p4.setY(-(maxDist2 * U[1][0] - maxDist1 * U[1][1]));

Related

What's the appropriate algorithm for locating places using a Cartesian coordinate system?

What algorithm can I use to locate and display places around me within a particular distance, such as 100m, using easting and northing and the name of the place where I'm based?
To be more clear, let's suppose I'm based in Charing Cross and I want to find all places within 100m using easting and northing data, for example easting = 10000m and northing = 20000m.
Thank you
Pythagoras is the relevant maths.
If your position is (x,y) then you can calc a distance to any other point (x2,y2) with:
distance = sqrt((x2-x)^2 + (y2-y)^2)
So you could just loop over all points, calc their distance and order the results by nearest first.
For large data sets this may become impractical, in which case you'll want to partition the points into large rectangles. The first stage then is to identify which rectangle your (x,y) is within and the adjacent rectangles, then loop through all points in those rectangles. You need the adjacent rectangles because your (x,y) might be right on the boundary of its rectangle.
More generally this partitioning approach comes under the general heading of spatial hashing. For very large areas you want a tree structure known as a quadtree, that breaks large areas down into smaller and smaller regions, but that might be overkill for what you want.
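A minimal sketch of the grid partitioning described above, assuming places are stored as hypothetical `(name, easting, northing)` tuples in metres. The function names `build_grid` and `places_within` and the sample places are made up for illustration.

```python
import math
from collections import defaultdict

def build_grid(places, cell_size):
    """Bucket (name, easting, northing) tuples by square grid cell."""
    grid = defaultdict(list)
    for name, x, y in places:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        grid[cell].append((name, x, y))
    return grid

def places_within(grid, cell_size, x, y, radius):
    """Return (distance, name) pairs within radius of (x, y), nearest first."""
    reach = math.ceil(radius / cell_size)  # number of cells the radius can span
    cx, cy = math.floor(x / cell_size), math.floor(y / cell_size)
    found = []
    # Visit the cell containing (x, y) and all adjacent cells within reach.
    for gx in range(cx - reach, cx + reach + 1):
        for gy in range(cy - reach, cy + reach + 1):
            for name, px, py in grid.get((gx, gy), []):
                d = math.hypot(px - x, py - y)  # Pythagoras
                if d <= radius:
                    found.append((d, name))
    return sorted(found)

# Hypothetical sample data around easting 10000, northing 20000.
places = [("cafe", 10050, 20030), ("museum", 10090, 20080), ("station", 10400, 20000)]
grid = build_grid(places, cell_size=100)
nearby = places_within(grid, 100, 10000, 20000, radius=100)
```

Only the cells near the query point are scanned, so the cost depends on local density rather than on the total number of places.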
I am assuming by Cartesian coordinates you also mean linear. If you are trying to do this using actual earth coordinates this answer gets more complicated (as we aren't on a flat earth). For simple linear coordinates you could do something like:
bool contains(double x, double y)
{
    // minx, maxx, miny, maxy describe the search box around your position
    return (x >= minx) && (x <= maxx) && (y >= miny) && (y <= maxy);
}
The min and max coordinates would be your current position plus or minus how far out you want to go. I think this is what you wanted to know. If you need accurate earth coordinates you might look into some geospatial libraries. If you need an estimate you can still use the algorithm above, but I would use something like rhumb lines to calculate the min and max coordinates.

calculating the point of acceleration

I've been struggling to calculate the acceleration. I've spent a whole day searching and on trial and error, but all in vain. I have one horizontal line on the stage (AS3) of, let's say, width 200. The center point of that line is at 60 (if it were at 100, I could simply calculate the percentage). Now I need to know the width for a given percentage. For example, what is the total width of 60%, or where will 30% (or any other percentage) start from?
What I know is the total width and the center point (either as a percentage or as a width).
Your help will be highly appreciated. If there is a formula, please give me details, don't just mention a/b/c, as I've never been a student of physics :(
Edit:
I don't have 10 reputations, so I can't post image directly here. Please click the following link to see the image.
Link: http://oi62.tinypic.com/11sk183.jpg
Edit:
Here is what I want exactly: I want to travel n% from any point (A/B/C/D) to its relative point (A->B/A->D ...) (Link)
http://i59.tinypic.com/2wp2lbl.jpg
If I understand correctly, you want a non-linear scale, so that pixel 1 on the line is 0%, pixel 100 on the line is 60% and pixel 200 is 100%?
If x=pixelpos/200 is the relative position on the line, one easy variation of the linear scale y=x*100% is y=(x+a*x*(1-x))*100%.
For x=0.5 the value is y=0.5+a*0.25, so for that to be 0.6=60% one needs a=0.4.
To get in the reverse direction the x for y=0.3=30%, one needs to solve a quadratic equation y=x*(1+a*(1-x)) or a*x^2-(1+a)*x+y=0. With the general solution formula, this gives
x = (1+a)/(2*a)-sqrt((1+a)^2-4*a*y)/(2*a)
= (2*y) / ( (1+a) + sqrt((1+a)^2-4*a*y) )
= (2*y) / ( (1+a) + sqrt((1-a)^2+4*a*(1-y)) )
and with a=0.4 and y=0.3
x = 0.6/( 1.4 + sqrt(1.96-0.48) )
approx 0.6/2.6=3/13=231/1001 approx 0.23
corresponding to pixel 46.
This will only work for a between -1 and 1, since for other values the slope at x=0 or x=1 will not be positive.
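As a quick check of the parabola variant, here is a small sketch of the forward map and its inverse. The function names `parabola_forward` and `parabola_inverse` are illustrative, and the line width of 200 is taken from the question.

```python
import math

def parabola_forward(x, a):
    """Map relative position x in [0, 1] to the fraction y = x + a*x*(1 - x)."""
    return x + a * x * (1.0 - x)

def parabola_inverse(y, a):
    """Solve y = x + a*x*(1 - x) for x, using the form without cancellation."""
    return 2.0 * y / ((1.0 + a) + math.sqrt((1.0 + a) ** 2 - 4.0 * a * y))

a = 0.4                           # makes the midpoint x = 0.5 map to 60%
x = parabola_inverse(0.3, a)      # relative position where 30% is reached
pixel = x * 200                   # line width of 200 from the question
```

The second form of the solution is used because it stays well defined when a is close to 0, where the first form divides by 2*a.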
Another simple formula uses hyperbola instead of parabola,
y=a*x/(1+(a-1)*x)
with the inversion by
y+(a-1)*x*y = a*x <=> y = (a-(a-1)*y)*x
x = (y/a)/(1+(1/a-1)*y)
and
a = (y*(1-x))/(x*(1-y))
here there is no problem with monotonicity as long as there is no pole for x in [0,1], which is guaranteed for a>0.
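The hyperbola variant can be sketched the same way. The function names are illustrative, and the sample point (x = 0.5 maps to y = 0.6) is the midpoint condition from the question.

```python
def hyperbola_forward(x, a):
    """y = a*x / (1 + (a - 1)*x), monotonic on [0, 1] for a > 0."""
    return a * x / (1.0 + (a - 1.0) * x)

def hyperbola_inverse(y, a):
    """x = (y/a) / (1 + (1/a - 1)*y)."""
    return (y / a) / (1.0 + (1.0 / a - 1.0) * y)

def hyperbola_parameter(x, y):
    """Parameter a for which position x maps to fraction y."""
    return (y * (1.0 - x)) / (x * (1.0 - y))

a = hyperbola_parameter(0.5, 0.6)        # midpoint of the line maps to 60%
x30 = hyperbola_inverse(0.3, a)          # relative position where 30% is reached
```

With these numbers a comes out as 1.5, and 30% is reached at roughly 22% of the line.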

Probability of intersection of two users with horizontal accuracy and vision area

I'm receiving data from GPS and store them in MySQL database. I have following columns:
User ID (int)
Latitude (double)
Longitude (double)
Horizontal Accuracy (double)
Horizontal accuracy is a radius around the Lat/Long, so the user is equally likely to be at any point within this area.
I need to find the probability that two users intersected. I also have a vision area, which is 30 meters. If the horizontal accuracy were 0, I could just measure the area of intersection of two circles with a radius of 30 meters around each lat/long. But in my case that's not possible, because the horizontal accuracy can range from 5 to 3000. Usually it's more than my vision area.
I think I can measure the area of intersection of two cones, where the inner circle of each cone has a radius of horizontal accuracy + 30 meters and the outer circle has a radius of horizontal accuracy. But this solution seems a little bit complicated.
I'd like to hear some thoughts about that and other possible solutions.
I've checked the MySQL Spatial extension and as far as I can see it can't do such calculations for me.
Thanks.
I worked on just such a problem as you are describing. I approached it by converting the Lat/Long (world coordinates) into X/Y (Cartesian coordinates) and then applying the Pythagorean theorem, a^2 + b^2 = c^2.
First you need to convert the Lat/Long coordinates.
To get X, multiply the radius by the cosine (cos) of the angle (NOTE: this angle has to be expressed in radians).
To get Y, do the same as above but use the sine function (sin).
To convert degrees to radians, multiply the angle by PI (approx. 3.14159...) / 180.
Radians = Angle * (PI / 180);
To solve for c from c^2 ("C squared"): c = SQRT (a*a + b*b);
For more information on Degrees to Radians: http://www.mathwarehouse.com/trigonometry/radians/convert-degee-to-radians.php
For more information on: Converting Lat/Long to X/Y coordinates: http://www.mathsisfun.com/polar-cartesian-coordinates.html
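One way to make the "convert, then use Pythagoras" idea concrete is a local flat-Earth (equirectangular) projection. This is a swapped-in technique, not the polar formula described above, and it is only reasonable over short distances. The Earth radius constant, the function names, and the `may_have_met` check (a necessary condition, not a probability) are assumptions made for this sketch.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres (approximation)

def to_local_xy(lat_deg, lon_deg, ref_lat_deg, ref_lon_deg):
    """Project lat/long to metres east/north of a nearby reference point."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    ref_lat, ref_lon = math.radians(ref_lat_deg), math.radians(ref_lon_deg)
    x = EARTH_RADIUS_M * (lon - ref_lon) * math.cos(ref_lat)
    y = EARTH_RADIUS_M * (lat - ref_lat)
    return x, y

def distance_m(lat1, lon1, lat2, lon2):
    """Approximate ground distance in metres via Pythagoras."""
    x, y = to_local_xy(lat2, lon2, lat1, lon1)
    return math.hypot(x, y)

def may_have_met(lat1, lon1, acc1, lat2, lon2, acc2, vision=30.0):
    """True if the two accuracy discs can come within vision range at all."""
    return distance_m(lat1, lon1, lat2, lon2) <= acc1 + acc2 + vision

d = distance_m(51.5, -0.12, 51.5009, -0.12)
```

A check like `may_have_met` can cheaply discard pairs whose probability of intersecting is zero before any more expensive area computation.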
I usually find the information that I need for this kind of problem by asking a question on ask.com.
All the best.
Allan

How do I zoom into the mandelbrot set?

I can generate a 400x400 image of the Mandelbrot set from minReal to maxReal and from minImaginary to maxImaginary. So,
makeMandel(minReal, maxReal, minImaginary, maxImaginary);
I need to modify it so that I can have,
makeMandel(centerX, centerY, Zoomlevel);
// generates a region of the mandelbrot set centered at centerX,centerY at a zoom level of Zoomlevel
(Considering zoom level represents the distance between the pixels and is given by the formula Zoom level n = 2 ^ (-n) so that zoom level 1 means pixels are 0.5 units apart, zoom level 2, 0.25 and so on...)
My question is how do I calculate the arguments of the first makeMandel function from the arguments of the second one?
I know that the first function is capable of zooming and moving around but I don't know how to calculate the correct numbers for any given center and zoom level.
I've been trying to get this working for more than three days now and I'm really confused. I tried drawing tables, etc... on paper and working it out.
I read most documents that you find on Google when searching for the mandelbrot set and a couple of past stackoverflow questions but I still don't understand. Please help me out.
You may solve it the following way. If you have the two definitions
centerX = (minReal + maxReal)/2
sizeX = maxReal - minReal
you can calculate the extents on the axis via
minReal = centerX - sizeX/2
maxReal = centerX + sizeX/2
The size then is calculated using the zoomLevel:
sizeX = 2^(-zoomLevel) * baseSize
The same formulas hold for y and imaginary axis.
sizeY = 2^(-zoomLevel) * baseSize
minImaginary = centerY - sizeY/2
maxImaginary = centerY + sizeY/2
The only thing to define as a constant is your baseSize, i.e. the extent along the real and imaginary axes when zoomLevel is zero. You may use a different baseSize in the real and imaginary directions to cover a non-square aspect ratio of your image.
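Put together, the formulas above amount to a small helper like the following sketch. The name `bounds_for` is hypothetical, and the default `base_size` of 400 is an assumption that follows from the question's definition (400 pixels spaced 2^(-n) units apart).

```python
def bounds_for(center_x, center_y, zoom_level, base_size=400.0):
    """Return (minReal, maxReal, minImaginary, maxImaginary) for a square view."""
    size = 2.0 ** (-zoom_level) * base_size  # extent of the view on each axis
    half = size / 2.0
    return (center_x - half, center_x + half,
            center_y - half, center_y + half)

# Zoom level 7 gives a pixel spacing of 1/128 and a view 3.125 units wide.
min_real, max_real, min_imag, max_imag = bounds_for(-0.5, 0.0, 7)
```

The four values can then be passed straight to the existing makeMandel(minReal, maxReal, minImaginary, maxImaginary).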

How can I better pack rectangles tangent to a sphere for a 3d gallery?

I am creating a 3D sphere gallery with ActionScript 3 and the Flash 10 3D (2.5D) APIs. I have found a method that works but is not ideal. I would like to see if there is a better method.
My algorithm goes like this:
Let n = the number of images
h = the height of each image
w = the width of each image
1. Approximate the radius of the sphere by assuming (incorrectly) that the total surface area of the images is equal to the surface area of the sphere we want to create. To calculate the radius, solve for r in n * w * h = 4πr². This is the part that needs to be improved.
2. Calculate the angle between rows: rowAngle = 2 * atan(h / 2 / r).
3. Calculate the number of rows: rows = floor(π / rowAngle).
4. Because step one is an approximation, the number of rows will not fit perfectly, so for presentation add padding to rowAngle: rowAngle += (π - rowAngle * rows) / rows.
5. For each i in rows:
   a. Calculate the radius of the circle of latitude for the row: latitudeRadius = radius * cos(π / 2 - rowAngle * i).
   b. Calculate the angle between columns: columnAngle = atan(w / 2 / latitudeRadius) * 2.
   c. Calculate the number of columns: columns = floor(2 * π / columnAngle).
   d. Because step one is an approximation, the number of columns will not fit perfectly, so for presentation add padding to columnAngle: columnAngle += (2 * π - columnAngle * columns) / columns.
   e. For each j in columns, translate -radius along the Z axis, rotate π / 2 + rowAngle * i around the X axis, and rotate columnAngle * j around the Y axis.
To see this in action, click here (alternate link). Notice that with the default settings, the number of items actually placed on the sphere is 13 fewer than requested. I believe this is the error introduced by my approximation in the first step.
I am not able to figure out a method for determining what the exact radius of such a sphere should be. I'm hoping to learn either a better method, the correct method, or that what I am trying to do is hard or very hard (in which case I will be happy with what I have).
I would divide this problem into two connected problems.
Given a radius, how do you pack things on to the sphere?
Given a number of things, how do you find the right radius?
If you have a solution to the first problem, the second is easy to solve. Here it is in pseudo-code.
lowerRadius = somethingTooSmall
upperRadius = 2 * lowerRadius
# Double until upperRadius fits enough items; lowerRadius stays too small.
while itemsForRadius(upperRadius) < wantedItems:
    lowerRadius = upperRadius
    upperRadius *= 2
# Bisect: lowerRadius never fits, upperRadius always fits.
while threshold < upperRadius - lowerRadius:
    middleRadius = (upperRadius + lowerRadius)/2
    if itemsForRadius(middleRadius) < wantedItems:
        lowerRadius = middleRadius
    else:
        upperRadius = middleRadius
# upperRadius is the answer, to within threshold.
This will find the smallest radius that will pack the desired number of things with your packing algorithm. If you wish you could start with a better starting point - your current estimate is pretty close. But I don't think that an analytic formula will do it.
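A runnable sketch of this search is shown below. It keeps a lower radius that is too small and an upper radius that is large enough, and it assumes the item count never decreases as the radius grows. The `area_estimate` function is only a hypothetical stand-in for a real packing routine: it reuses the surface-area approximation from the question with 1 x 1 images.

```python
import math

def smallest_radius(items_for_radius, wanted_items, start=1e-3, threshold=1e-6):
    """Find (within threshold) the smallest radius that fits wanted_items.

    Assumes items_for_radius(start) < wanted_items and that the count
    does not decrease when the radius increases.
    """
    lower = start
    upper = 2.0 * lower
    # Doubling phase: grow until the upper radius fits enough items.
    while items_for_radius(upper) < wanted_items:
        lower = upper
        upper *= 2.0
    # Bisection phase: lower never fits, upper always fits.
    while upper - lower > threshold:
        middle = (upper + lower) / 2.0
        if items_for_radius(middle) < wanted_items:
            lower = middle
        else:
            upper = middle
    return upper

def area_estimate(radius, w=1.0, h=1.0):
    # Hypothetical stand-in: sphere area divided by image area.
    return math.floor(4.0 * math.pi * radius ** 2 / (w * h))

radius = smallest_radius(area_estimate, 100)
```

Replacing `area_estimate` with the actual packing routine leaves the search itself unchanged.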
Now let's turn to the first problem. You have a very reasonable approach. It does have one serious bug though. The bug is that your columnAngle should not be calculated for the middle of your row. What you need to do is figure out the latitude which your items are in that is closest to the pole, and use that for the calculation. This is why when you try to fit 10 items you find a packing that causes the corners to overlap.
If you want a denser packing, you can try squishing rows towards the equator. This will result in sometimes having room for more items in a row so you'll get more things in a smaller sphere. But visually it may not look as nice. Play with it, and decide whether you like the result.
BTW I like the idea. It looks nice.
In the case of squares, there seems to be an approximate formula relating the radius, the square's side, and the number of squares that fit.
Following this, the number of squares is the full solid angle 4 Pi divided by the solid angle that one tangent square subtends at the centre of the sphere:
Floor[4 Pi/Integrate[r (x^2 + y^2 + r^2)^(-3/2), {x, -a/2, a/2}, {y, -a/2, a/2}]]
or
Floor[Pi/ArcCot[(2 Sqrt[2] r Sqrt[a^2+2 r^2])/a^2]]
where
r = Radius
a = Square side
If you plot for r=1, as a function of a:
Where you can see the case a=2 is the boundary for n=6, meaning a cube:
Still working to see if it can be extended to the case of a generic rectangle.
Edit
For rectangles, the corresponding formula is:
Floor[4 Pi/Integrate[r (x^2 + y^2 + r^2)^(-3/2), {x, -a/2, a/2}, {y, -b/2, b/2}]]
which gives:
Floor[(2 Pi)/(Pi-2 ArcTan[(2 r Sqrt[a^2+b^2+4 r^2])/(a b)])]
where
r = Radius
a,b = Rectangle sides
Let's suppose we want rectangles with one side half of the other (b = a/2) and a sphere of radius 1.
Then the number of rectangles as a function of a is:
Here you can see that a rectangle with a "large" side of size 2 allows 9 rectangles on the sphere, while a rectangle with a "large" side of 4 allows only 4 rectangles.
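The estimate can be evaluated numerically with a short sketch. It uses the standard solid-angle expression for a centred rectangle at distance r, which agrees with the closed forms above at r = 1. The function names are illustrative, and the result is an upper-bound style estimate rather than an actual packing.

```python
import math

def rectangle_solid_angle(a, b, r):
    """Solid angle of an a-by-b rectangle touching a sphere of radius r at its centre."""
    return 4.0 * math.atan(a * b / (2.0 * r * math.sqrt(a * a + b * b + 4.0 * r * r)))

def estimated_count(a, b, r):
    """Full sphere (4*pi steradians) divided by the solid angle of one rectangle."""
    return 4.0 * math.pi / rectangle_solid_angle(a, b, r)

cube_faces = estimated_count(2.0, 2.0, 1.0)            # boundary case: the cube
half_rect = math.floor(estimated_count(2.0, 1.0, 1.0)) # b = a/2 with a = 2
big_rect = math.floor(estimated_count(4.0, 2.0, 1.0))  # b = a/2 with a = 4
```

The square case is left unfloored here because a = 2 sits exactly on the boundary n = 6, where rounding error could tip the floor either way.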