How to use MySQL geospatial extensions with spherical geometries - mysql

I would like to store thousands of latitude/longitude points in a MySQL db. I was successful at setting up the tables and adding the data using the geospatial extensions where the column 'coord' is a Point(lat, lng).
Problem:
I want to quickly find the 'N' closest entries to latitude 'X' degrees and longitude 'Y' degrees. Since the Distance() function has not yet been implemented, I used the GLength() function to calculate the distance between (X,Y) and each of the entries, sorting by ascending distance and limiting to 'N' results. The problem is that this does not calculate the shortest distance using spherical geometry. This means that if Y = 179.9 degrees, the list of closest entries will only include longitudes starting at 179.9 and decreasing, even though closer entries exist with longitudes increasing from -179.9.
How does one typically handle the discontinuity in longitude when working with spherical geometries in databases? There has to be an easy solution to this, but I must just be searching for the wrong thing because I have not found anything helpful.
Should I just forget the GLength() function and create my own function for calculating angular separation? If I do this, will it still be fast and take advantage of the geospatial extensions?
Thanks!
josh
UPDATE:
This is exactly what I am describing above. However, it applies only to SQL Server. Apparently SQL Server has both Geometry and Geography data types, and the Geography type does exactly what I need. Is there something similar in MySQL?

How does one typically handle the discontinuity in longitude when working with spherical geometries in databases?
Not many people use MySQL for this, because its geospatial extensions aren't really up to snuff.
From the docs:
"All calculations are done assuming Euclidean (planar) geometry."
The solution is usually to roll your own.
Alternatively, you can fake it: if your distances are less than 500 miles or so, you can treat your latitude and longitude as rectangular coordinates and just use the Euclidean distance formula (sqrt(a^2 + b^2)).
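A minimal Python sketch of the "fake it" approach, with two additions that are not part of the answer above: the longitude difference is wrapped into [-180, 180) so that 179.9 and -179.9 come out 0.2 degrees apart, and it is scaled by the cosine of the mean latitude. The function names and the Earth radius constant are illustrative assumptions, and the result is only an approximation for short distances.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius (assumption)

def wrap_lon_delta(lon1, lon2):
    """Smallest signed longitude difference in degrees, in [-180, 180)."""
    return (lon2 - lon1 + 180.0) % 360.0 - 180.0

def approx_distance_miles(lat1, lon1, lat2, lon2):
    """Planar (equirectangular) approximation; reasonable for short distances only."""
    mean_lat = math.radians((lat1 + lat2) / 2.0)
    # Scale the east-west difference, since meridians converge away from the equator.
    dx = math.radians(wrap_lon_delta(lon1, lon2)) * math.cos(mean_lat)
    dy = math.radians(lat2 - lat1)
    return EARTH_RADIUS_MILES * math.hypot(dx, dy)
```

With this, a point at longitude -179.9 ranks closer to 179.9 than a point at longitude 179.0 does, which is the behavior the question asks for.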

Related

Given a user's lat lng, how to find the nearest lat lng from a database of thousands of lat lng?

I have data of locations of thousands of sensors in MySQL. I want to identify the sensor closest to the user's location and show that specific sensor's data. All the location data is available as lat lng.
I understand that one approach can be to find displacements between the origin and all the sensors using Haversine formula and select the one with the shortest distance. The problem here is that there are tens of thousands of sensors.
Any suggestions/leads?
A spatial index allows efficient queries for points within a specific distance. The problem, of course, is that one might not know the search radius needed in a specific case. Unfortunately, a large radius causes inefficient queries, and a small radius might return no match at all.
A possible solution is to search with increasing radius, until the search returns some results, and then find the closest result among those.
This article describes the solution for BigQuery; it would require some adaptation for MySQL's scripting dialect:
https://mentin.medium.com/nearest-neighbor-using-bq-scripting-373241f5b2f5
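The increasing-radius idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the linked article's code: `points_within` is a hypothetical stand-in for an indexed radius query (here it is just a linear scan), and the starting radius, doubling factor, and cap are arbitrary choices.

```python
import math

EARTH_RADIUS_KM = 6371.0  # approximate mean Earth radius (assumption)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def points_within(points, lat, lon, radius_km):
    """Hypothetical stand-in for a spatial-index radius query (linear scan here)."""
    hits = []
    for p in points:
        d = haversine_km(lat, lon, p[0], p[1])
        if d <= radius_km:
            hits.append((d, p))
    return hits

def nearest_by_expanding_radius(points, lat, lon, start_km=1.0, max_km=20100.0):
    """Double the search radius until something is found, then pick the closest hit."""
    radius = start_km
    while True:
        hits = points_within(points, lat, lon, radius)
        if hits:
            return min(hits, key=lambda h: h[0])[1]
        if radius >= max_km:  # roughly half the Earth's circumference: nothing left to find
            return None
        radius = min(radius * 2.0, max_km)
```

Any point inside the first non-empty radius is closer than every point outside it, so the minimum over those hits is the overall nearest neighbor.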
Not the MySQL answer you are looking for, but PostgreSQL's popular PostGIS extension has a built-in K Nearest Neighbor operator class. Also, see its documentation. It works great!
Also, I am aware of this Go library that allows you to do KNN in memory after building a Quadtree with your sensor locations.
For only thousands, a simple bounding box with two 2-column indexes may be fast enough.
For better speed, see SPATIAL indexing.
For details on those two solutions, plus two faster ones, see Find Nearest
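As a rough illustration of the bounding-box approach, the Python sketch below computes the latitude and longitude limits for a given radius. The function names are hypothetical, and the sketch deliberately ignores the poles and the ±180 degree longitude seam, so treat it as a starting point only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # approximate mean Earth radius (assumption)

def bounding_box(lat, lon, radius_km):
    """Return (min_lat, max_lat, min_lon, max_lon) enclosing a circle of radius_km.

    Ignores pole and antimeridian wrap-around.
    """
    dlat = math.degrees(radius_km / EARTH_RADIUS_KM)
    # A degree of longitude shrinks by cos(latitude), so the box must widen.
    dlon = dlat / max(math.cos(math.radians(lat)), 1e-12)
    return (lat - dlat, lat + dlat, lon - dlon, lon + dlon)

def in_box(point, box):
    """True if a (lat, lon) tuple falls inside the box."""
    min_lat, max_lat, min_lon, max_lon = box
    return min_lat <= point[0] <= max_lat and min_lon <= point[1] <= max_lon
```

In a database, the four limits would feed a pair of range conditions on the latitude and longitude columns, so that the indexes can discard most rows before any exact distance is computed on the remainder.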

How to cluster latitude-longitude data based on fixed radius from centroid as the only constraint?

I have around 200k latitude & longitude data points. How can I cluster them so that each cluster contains only points strictly within radius = 1 km of its centroid?
I tried the leaderCluster algorithm/package in R, but even though I specify radius = 1 km, it does not strictly enforce it, i.e. it gives clusters in which many points lie, say, 5-10 km from the cluster centroid. So it does not meet my requirement.
The number of points in a cluster can vary; that is not a problem.
Is there a way to enforce the strict radius constraint in hierarchical or another clustering algorithm? I am looking for the steps & implementation in R/Python.
I tried searching Stack Overflow but couldn't find a solution in R/Python.
How can I visualize the cluster centroids in Google Maps after the clustering is done?
EDIT
Parameters I am using in ELKI. Please verify
This is not so much a clustering, but a set cover type of problem. At least if you are looking for a good cover. A clustering algorithm is about finding structure in your data; but you are looking for some forced quantization.
Anyway, here are two strategies you can try e.g. in ELKI:
Canopy preclustering with T1=T2=your radius. This should yield a greedy approximation to the cover scenario.
Complete linkage hierarchical agglomerative clustering, cut at the desired height. This is fairly expensive (O(n^3)). Any two points in the same cluster have at most this distance, so this is a bit stricter than your requirement.
Beware that you should be using haversine ("geo") distances, not Euclidean!
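To make the first strategy concrete, here is a small Python sketch of a greedy canopy-style cover with T1 = T2 = radius, using haversine distances. It is not ELKI's implementation; the function names are illustrative, and each cluster is built around a data point (a "leader"), so the guarantee is distance to that leader rather than to the mean of the members. The repeated linear scans are also too slow for 200k points without a spatial index.

```python
import math

EARTH_RADIUS_KM = 6371.0  # approximate mean Earth radius (assumption)

def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) tuples."""
    p1, p2 = math.radians(a[0]), math.radians(b[0])
    dphi = p2 - p1
    dlmb = math.radians(b[1] - a[1])
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))

def greedy_cover(points, radius_km):
    """Repeatedly pick a leader and absorb every remaining point within radius_km."""
    remaining = list(points)
    clusters = []
    while remaining:
        leader = remaining[0]
        members, rest = [], []
        for p in remaining:
            (members if haversine_km(leader, p) <= radius_km else rest).append(p)
        clusters.append((leader, members))
        remaining = rest
    return clusters
```

Every point ends up in exactly one cluster, and no member is farther than the radius from its leader; the number of clusters depends on the order in which leaders are picked.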

Is there any MySQL function to directly get 5 closest coordinates to a given coordinate from database?

I am working with PHP and use MySQL for the database. I need a way to get the 5 closest coordinates to a given coordinate from the database that is very fast and at least 80-90% accurate. I have researched a lot. I found the haversine formula, the spherical law of cosines, the bounding square method (comparing min and max latitude-longitude values with the coordinates in the database), and other methods that use trigonometric math functions. But all these formulas take a long time to return a result in a database with thousands of entries. Does MySQL provide any function to do it fast?
See this similar question on the GIS Stack site. The performance of your ultimate solution will depend on how many targets are in the reference table you are searching and whether you can limit the distance you are interested in (such as the closest 5 within 30 miles). I don't think you can reliably optimize the process; you need to calculate the distance for all coordinates in your reference table.
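For a sense of what "calculate the distance for all coordinates" looks like outside the database, here is a short Python sketch that uses the spherical law of cosines (one of the formulas mentioned in the question) and keeps the 5 smallest distances. The function names and the optional distance cap are assumptions for illustration.

```python
import heapq
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius (assumption)

def law_of_cosines_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance via the spherical law of cosines."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    c = math.sin(p1) * math.sin(p2) + math.cos(p1) * math.cos(p2) * math.cos(dlmb)
    # Clamp to [-1, 1]: rounding can push c slightly out of acos's domain.
    return EARTH_RADIUS_MILES * math.acos(max(-1.0, min(1.0, c)))

def closest(points, lat, lon, k=5, max_miles=None):
    """Return up to k (distance, point) pairs, nearest first, scanning every point."""
    scored = ((law_of_cosines_miles(lat, lon, p[0], p[1]), p) for p in points)
    if max_miles is not None:
        scored = (s for s in scored if s[0] <= max_miles)
    return heapq.nsmallest(k, scored, key=lambda s: s[0])
```

A full scan over a few thousand rows is usually cheap in absolute terms; the distance cap simply discards candidates that are too far away to matter.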

SQL search results by distance

I need some help, I've never done my own SQL search before and I'm trying to do this:
I have a database of names and locations (the locations are listed with a Latitude record and a Longitude record). Then, a user can search by entering their zip code (which is converted to longitude and latitude) and a distance they're willing to travel (in miles, which I can convert to lon/lat distance).
How can I return the results ordered by the distance away from their ZipCode?
Please keep in mind, I haven't ever done anything like this before.
There's a mathematical formula for figuring the shortest distance between two points on a sphere. The formula and a JS implementation of it are here:
http://www.movable-type.co.uk/scripts/latlong.html
A T-SQL implementation is here:
http://weblogs.asp.net/jimjackson/archive/2009/02/13/calculating-distances-between-latitude-and-longitude-t-sql-haversine.aspx
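For readers who prefer to see the formula written out, here is a minimal Python sketch of the haversine distance, together with a helper that keeps only the rows within the user's travel distance and orders them nearest first. The row layout (dictionaries with `name`, `lat`, and `lon` keys) and the function names are assumptions for illustration, not taken from the linked pages.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius (assumption)

def haversine_miles(lat1, lon1, lat2, lon2):
    """Shortest distance over the sphere between two lat/lon points, in miles."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))

def within_distance_sorted(rows, lat, lon, max_miles):
    """Return (distance, row) pairs within max_miles, ordered by distance."""
    scored = [(haversine_miles(lat, lon, r["lat"], r["lon"]), r) for r in rows]
    return sorted((s for s in scored if s[0] <= max_miles), key=lambda s: s[0])
```

The same two steps, computing a distance expression per row and then filtering and ordering by it, carry over to a SQL query.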

MySQL Lat/Lon radius search

I have a table with zipcode (int) and Location (point). I'm looking for a MySQL query or function. Here is an example of the data. I'd like to return everything within 100 miles.
37922|POINT(35.85802 -84.11938)
Is there an easy query to achieve this?
Okay so I have this
SELECT X(Location), Y(Location) FROM zipcodes;
This gives me the two coordinates of each point, but how do I figure out what's within a given distance of x/y?
The query to do this is not too hard, but is slow. You would want to use the Haversine formula.
http://en.wikipedia.org/wiki/Haversine_formula
Converting that to SQL should not be too difficult, but calculating the distance for every record in a table gets costly as the data set increases.
The work can be significantly reduced by using a geohash function to limit the locus of candidate records. If accuracy is important, the Haversine formula can be applied to the records inside a geohash region.
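To show what a geohash function does, here is a compact Python sketch of the standard geohash encoding, plus a hypothetical helper that keeps only the rows sharing the query point's cell. The helper and the chosen precision are assumptions; a real implementation would also check the neighboring cells, because the nearest point can sit just across a cell boundary.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)

def geohash_encode(lat, lon, precision=6):
    """Encode a lat/lon pair by alternately halving the longitude and latitude ranges."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, bits, bit_count = [], 0, 0
    use_lon = True  # geohash starts with a longitude bit
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2.0
            bit = lon >= mid
            if bit:
                lon_lo = mid
            else:
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2.0
            bit = lat >= mid
            if bit:
                lat_lo = mid
            else:
                lat_hi = mid
        bits = (bits << 1) | int(bit)
        bit_count += 1
        use_lon = not use_lon
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)

def candidates_in_cell(rows, lat, lon, precision=4):
    """Keep the (lat, lon) rows that fall in the same geohash cell as the query point."""
    cell = geohash_encode(lat, lon, precision)
    return [r for r in rows if geohash_encode(r[0], r[1], precision) == cell]
```

In a database the hash would typically be stored in an indexed column, so that the candidate set is found with a prefix match and the Haversine formula is applied only to those rows.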
If the MySQL people never completed their GIS and spatial extension, consider using Elasticsearch or MongoDB.
There is a pretty complete discussion here:
Formulas to Calculate Geo Proximity