Storing vector coordinates in MySQL - mysql

I am creating a database to track the (normalized) coordinates of events within a coordinate system. Think: a basketball shot chart, where coordinates of shot attempts are stored relative to where they were taken on the basketball court, in both positive and negative directions from center court.
I'm not exactly sure of the best way to store this information in a database so that I have the most flexibility in using the data. My options are:
Store a JSON object in a TEXT/CHAR column with X and Y properties
Store each X and Y coordinate in two DECIMAL columns
Use MySQL's spatial POINT object to store the coordinate
My goal is to store a normalized vector2 (as a percentage of the bounding box), so I can map the positions back out onto a rectangle of any size.
It would be nice to be able to do calculations, like the distance from another point, but my understanding is that spatial objects are meant more for geographical coordinates than for a normalized vector. The other options make calculations a bit more difficult, although calculations aren't a definite requirement for my project at the moment.
Is it possible to use spatial POINT for this and would calculations be similar to that of measuring geographical points?

It is possible to use POINT, but it may be more of a hassle retrieving or modifying the values as it is stored in binary form. You won't be able to view or modify the field directly; you would use an SQL statement to get the components or create a new POINT to replace the old one.
They are stored as numbers and you can do normal mathematical operations on them. Geospatial-type calculations on distance would use other geospatial data types such as LINESTRING.
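To insert a point you have to create it from two numbers (I think in your case there would be no issues with the size of the numbers). MySQL 5.7 and later provide this function as ST_GeomFromText; the unprefixed name shown here was removed in MySQL 8.0: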
INSERT INTO coordinatetable(testpoint) VALUES (GeomFromText('POINT(-100473882.33 2133151132.13)'));
INSERT INTO coordinatetable(testpoint) VALUES (GeomFromText('POINT(0.3 -0.213318973)'));
To retrieve it you have to select the X and Y values separately (use ST_X() and ST_Y() on MySQL 5.7 and later; the unprefixed X() and Y() were removed in MySQL 8.0):
SELECT X(testpoint), Y(testpoint) from coordinatetable;
For your case, I would store the X and Y coordinates in two DECIMAL columns. They are easier to retrieve and modify, and keeping X and Y separate gives you direct access to each coordinate rather than having to extract the values from data stored in a single field. For larger data sets, it may also speed up your queries.
For example:
Whether the player is past half court only requires Y-coordinate
How much help the player could possibly get from the backboard would rely more on the X-coordinate than the Y-coordinate (X closer to zero => Straighter shot)
Whether the player usually scores from locations close to the long edges of the court would rely more on the X-coordinate than the Y-coordinate (X approaches 1 or -1)
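As a rough illustration of working with two separate coordinate columns on the application side, here is a minimal Python sketch. It assumes the coordinates are normalized to the range [-1, 1] with the origin at center court; the function names `to_pixels` and `distance` are made up for this example and are not part of any library.

```python
import math

def to_pixels(x, y, width, height):
    """Map a normalized (x, y) in [-1, 1], origin at the center,
    onto a rectangle of the given width and height."""
    px = (x + 1) / 2 * width
    py = (y + 1) / 2 * height
    return px, py

def distance(ax, ay, bx, by):
    """Straight-line distance between two points in the same units."""
    return math.hypot(bx - ax, by - ay)
```

Because the two axes are normalized independently, a distance computed on the normalized values is distorted unless the target rectangle is square; mapping both points with `to_pixels` first and measuring afterwards avoids that.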

Related

Determining the convex hull in the presence of outliers

I wrote software to create and optimize a racing line on a racetrack.
Now I want to extend it with real data recorded from GPS, so I need to obtain the g-g diagram, where g is the acceleration. The real g-g diagram is a set of points in a scatter plot. I need the contour of that scatter plot to use as the boundary of the acceleration limits.
To obtain data to work with, I recorded myself on two different racetracks.
The code I wrote translates the x-y coordinates to polar R-theta.
Then I divide the circle into a fixed number of sectors (say, 20).
I calculate the histogram of all R values in each sector, then from the histogram I take the last value with an acceptable number of samples.
Then I draw these lines, and this is the result:
It's not bad, but this boundary sits a little inside the real data; the real acceleration is a little bigger. I cannot simply take the max value, because that would include absurd values (like 3g in a right corner, surely an error). Moreover, the limit changes if I change the number of bins in the histogram, and I cannot find a way to choose the right number of bins.
How can I determine the "true" convex hull, ignoring the outliers?
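For reference, the sector-and-histogram procedure described in this question might look roughly like the Python sketch below. It is a reconstruction under assumptions, not the asker's actual code: the function name `sector_boundary`, the parameters `n_bins` and `min_count`, and the choice to report the outer edge of the last sufficiently populated bin are all hypothetical.

```python
import math

def sector_boundary(points, n_sectors=20, n_bins=10, min_count=3):
    """For each angular sector, return the outer edge of the farthest radial
    histogram bin that still holds at least min_count samples, or None."""
    radii = [[] for _ in range(n_sectors)]
    for x, y in points:
        r = math.hypot(x, y)
        theta = math.atan2(y, x) % (2 * math.pi)  # angle in [0, 2*pi)
        s = min(int(theta / (2 * math.pi) * n_sectors), n_sectors - 1)
        radii[s].append(r)

    # All sectors share one radial scale so the bins are comparable.
    r_max = max((r for rs in radii for r in rs), default=0.0)

    boundary = []
    for rs in radii:
        if not rs or r_max == 0.0:
            boundary.append(None)
            continue
        counts = [0] * n_bins
        for r in rs:
            counts[min(int(r / r_max * n_bins), n_bins - 1)] += 1
        limit = None
        # Walk inward from the outermost bin until one is populated enough.
        for b in range(n_bins - 1, -1, -1):
            if counts[b] >= min_count:
                limit = (b + 1) * r_max / n_bins
                break
        boundary.append(limit)
    return boundary
```

In this sketch a lone outlier stretches `r_max` and therefore widens every bin, which is one way the result ends up depending on the bin count, as the question observes.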

Best way to store a 3D Vector in MySql to grab entries within a distance?

I am storing entries in a database and I would like to store a 3D vector in a field, then later select all rows within X distance of a given 3D vector. I am thinking of storing X, Y, and Z in their own fields and then doing basic greater-than and less-than comparisons, but is there a better way I am overlooking?
You'll have to do the math every time if the start point is totally random. If you have a large dataset, you could optimize by pre-analyzing it for clusters of points that fall within some minimum distance of each other. Then you could avoid computations on every point in a cluster.
I think that if you store your values in polar co-ordinates then there might be an optimization on the distance computation which reduces the number of computations.
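The greater-than/less-than idea from the question amounts to a bounding-box test, which can be followed by an exact distance check on the survivors. A minimal Python sketch of that two-step filter, assuming plain Cartesian coordinates (the function name `within_distance` is made up for this example):

```python
def within_distance(points, center, radius):
    """Return the (x, y, z) points that lie within radius of center."""
    cx, cy, cz = center
    r2 = radius * radius
    hits = []
    for x, y, z in points:
        dx, dy, dz = x - cx, y - cy, z - cz
        # Cheap rejection: outside the axis-aligned box around the center.
        if abs(dx) > radius or abs(dy) > radius or abs(dz) > radius:
            continue
        # Exact test on squared distances, so no square root is needed.
        if dx * dx + dy * dy + dz * dz <= r2:
            hits.append((x, y, z))
    return hits
```

In a database, the box test is the part that indexed range comparisons on the X, Y, and Z columns can serve; the squared-distance test then only has to run on the rows that pass it.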

Calculating the centroid from points in a MySQL-table

In a MySQL table I have a column of the "point" geospatial datatype. Is it possible to calculate the centroid of the point values of all rows directly in MySQL?
The aim of my project is to put the center of a map at the centroid of the points that it contains.
One potential solution is given in the MySQL documentation: Centroid(mpoly). But this would mean that I would have to concatenate all the points' values externally in a programming language and then send the resulting query back to MySQL. This sounds quirky to me.
A centroid is simply the point whose coordinates are the mean X and mean Y values, so the following should work:
SELECT
POINT( AVG(X(geographic_location)), AVG(Y(geographic_location)) )
FROM poles
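The same averaging, written out in Python as a sanity check of the idea. This is only a sketch: it treats the coordinates as planar values, and the function name `centroid` is chosen for this example.

```python
def centroid(points):
    """Mean of the X values and mean of the Y values of a list of (x, y) pairs."""
    if not points:
        raise ValueError("centroid of an empty point set is undefined")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    return mean_x, mean_y
```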

Storing millions of 3D coordinates in MySQL - bad idea?

All-
So I need to store 3D positions (x, y, z) associated with objects in a video game.
I'm curious, is this a terrible idea? The positions are generated quite frequently, and may vary some.
I basically would ONLY like to store the position in my database if it's not within a yard of a position already stored.
I was basically selecting the existing positions for an object in the game (by object_id, object_type, continent and game_version), looping through them, and calculating the distance using PHP. If it was > 1, I would insert it.
Now that I'm at about 7 million rows (obviously not all for the same object), this isn't efficient and the server I'm using is slowing to a crawl.
Does anyone have any ideas on how I could better store this information? I'd prefer it be in MySQL somehow.
Here is the structure of the table:
object_id
object_type (like unit or game object)
x
y
z
continent (an object can be on more than one continent)
game_version (positions can vary based on the game version)
Later when I need to access the data, I basically only query it by object_id, object_type, continent, and game_version (so I have an index on these 4)
Thanks!
Josh
Presumably objects on different continents are considered infinitely far apart. Also you haven't disclosed the units you're using in your table. I'll assume inches (of which there are 36 in a yard).
So, before you insert a point you need to determine whether you're within a yard. To do this you're going to need either the MySQL geo extension (which you can go read about) or separate indexes on at least your x and y columns, and maybe the z column.
Are there any points within a yard? This query will get you whether there are any points within the bounding box of +/- one yard around your new point. A 'nearby' result of one or more means you shouldn't insert the new point.
SELECT COUNT(*) AS nearby
FROM positions t  -- placeholder table name; TABLE itself is a reserved word
WHERE t.x BETWEEN (?xpos - 36) AND (?xpos + 36)
  AND t.y BETWEEN (?ypos - 36) AND (?ypos + 36)
  AND t.z BETWEEN (?zpos - 36) AND (?zpos + 36)
  AND t.continent = ?cpos
If you need the query to work with Cartesian distances rather than bounding boxes you can add a sum-of-squares distance computation. But I suspect bounding boxes will work just fine for your app, and be much more efficient than repeatedly fetching 75-row result sets to do proximity testing in your application.
Conceptually it wouldn't be much harder to create a stored procedure for MySQL that would conditionally insert the new row only if it met the proximity criteria. That way you'd have a simple one-way transaction rather than server back-and-forth.
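The insert-only-if-nothing-is-nearby logic can be sketched in Python over an in-memory list of rows. This only illustrates the rule and does not replace a stored procedure; the function name `insert_if_isolated` and the dictionary row layout are assumptions for the example, and the default radius of 36 follows the inches assumption made above.

```python
def insert_if_isolated(rows, new_row, radius=36):
    """Append new_row unless an existing row on the same continent lies inside
    the +/- radius bounding box around it. Returns True if the row was added."""
    for row in rows:
        if (row["continent"] == new_row["continent"]
                and abs(row["x"] - new_row["x"]) <= radius
                and abs(row["y"] - new_row["y"]) <= radius
                and abs(row["z"] - new_row["z"]) <= radius):
            return False  # something is already close enough; skip the insert
    rows.append(new_row)
    return True
```

The comparisons are inclusive, matching the behavior of BETWEEN in the query above. Inside the database, the check and the insert would need to run in one transaction so that two concurrent inserts cannot both pass the check.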
It may be killing your server because of the continuous disk activity. That could be fixed by having MySQL work in memory: add ENGINE = MEMORY to your table definition. Keep in mind that MEMORY tables lose their contents when the server restarts.

Get polygons close to a lat,long in MySQL

Does anyone know of a way to fetch all polygons in a MySQL db within a given distance from a point? The actual distance is not that important since it's calculated for each found polygon later, but it would be a huge optimization to just do that calculation for the polygons that are "close".
I've looked at the MBR and contains functions but the problem is that some of the polygons are not contained within a bounding box drawn around the point since they are very big, but some of their vertices are still close.
Any suggestions?
A slow version (without spatial indexes):
SELECT *
FROM mytable
WHERE MBRIntersects(mypolygon, LineString(Point(#X - #distance, #Y - #distance), Point(#X + #distance, #Y + #distance)))
To make use of the spatial indexes, you need to denormalize your table so that each polygon vertex is stored in its own record.
Then create the SPATIAL INDEX on the field which contains the coordinates of the vertices and just issue this query:
SELECT DISTINCT polygon_id
FROM vertices
WHERE MBRContains(vertex, LineString(Point(#X - #distance, #Y - #distance), Point(#X + #distance, #Y + #distance)))
Things will be much easier if you store UTM coordinates in your database rather than latitude and longitude.
I don't think there's a single answer to this. It's generally a question of how to organize your data so that it makes use of the spatial locality inherent to your problem.
The first idea that pops into my head would be to use a grid, assign each point to a square, and select the square the point is in plus those around it. If we're talking about infinite grids, use a hash value of the square; this would give you more points than needed (where you have collisions), but would still reduce the number considerably. Of course this isn't immediately applicable to polygons, it's just a brainstorm. A possible approach, which might yield too many collisions, would be to OR all hashed values together and select all entries where the hashes ANDed with that value are non-zero (not sure if this is possible in MySQL); you might want to use a large number of bits though.
The problem with this approach, assuming we're talking about spherical coordinates (as lat, long generally are), is the singularities: the grid 'squares' grow narrower as you approach the poles. The easy approach to this is... don't put any points close to the poles... :)
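The basic grid idea (without the hashing variant) could look something like the following Python sketch. It assumes planar coordinates, so it does nothing about the pole problem just mentioned, and all names (`cell_of`, `build_grid`, `candidates`) are made up for the example.

```python
import math
from collections import defaultdict

def cell_of(x, y, cell_size):
    """Integer grid cell containing the point (x, y)."""
    return (math.floor(x / cell_size), math.floor(y / cell_size))

def build_grid(points, cell_size):
    """Group (x, y) points by the grid cell they fall in."""
    grid = defaultdict(list)
    for p in points:
        grid[cell_of(p[0], p[1], cell_size)].append(p)
    return grid

def candidates(grid, x, y, cell_size):
    """Points in the cell containing (x, y) and in its eight neighbors."""
    cx, cy = cell_of(x, y, cell_size)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            found.extend(grid.get((cx + dx, cy + dy), []))
    return found
```

As long as the search distance is no larger than `cell_size`, every point within that distance falls in one of the nine cells examined, so the exact distance check only has to run on the returned candidates.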
Create a bounding box for each of the polygons (optionally storing these results in the database, which will make this a lot faster for complex polygons). You can then compare the bounding box of each polygon with one drawn around the point at the desired size. Select all the polygons whose bounding boxes intersect it.
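A minimal Python sketch of that bounding-box comparison, assuming each polygon is given as a list of (x, y) vertices keyed by an id. The function names and the dictionary layout are hypothetical.

```python
def bounding_box(vertices):
    """Axis-aligned box (min_x, min_y, max_x, max_y) around a list of (x, y)."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (min(xs), min(ys), max(xs), max(ys))

def boxes_intersect(a, b):
    """True if two (min_x, min_y, max_x, max_y) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def polygons_near(polygons, x, y, distance):
    """Ids of polygons whose bounding box intersects the box around (x, y)."""
    query = (x - distance, y - distance, x + distance, y + distance)
    return [pid for pid, verts in polygons.items()
            if boxes_intersect(bounding_box(verts), query)]
```

Testing for intersection rather than containment is what lets a very large polygon qualify even though it does not fit inside the box drawn around the point.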