How to get nearest coordinates from database in mysql? - mysql

I have got a table with id,latitude (lat),longitude (lng),altitude (alt).
I have some coordinates and I would like to find the closest entry in the DB.
I used this but not yet working correctly:
SELECT lat,ABS(lat - TestCordLat), lng, ABS(lng - TestCordLng), alt AS distance
FROM dhm200
ORDER BY distance
LIMIT 6
I have a table with the 6 nearest points showing me the latitude, longitude and altitude.

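Query to get the nearest rows by distance from MySQL. Note that 69.1 is the approximate number of miles in one degree of latitude, so the distance comes out in miles; multiply it by 1.609344 to get kilometers (km):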
SELECT id, latitude, longitude, SQRT( POW(69.1 * (latitude - 4.66455174) , 2) + POW(69.1 * (-74.07867091 - longitude) * COS(latitude / 57.3) , 2)) AS distance FROM ranks ORDER BY distance ASC;
You may wish to limit the radius with a HAVING clause.
... AS distance FROM ranks HAVING distance < 150 ORDER BY distance ASC;
Example:
mysql> describe ranks;
+------------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+---------------+------+-----+---------+----------------+
| id | int | NO | PRI | NULL | auto_increment |
| latitude | decimal(10,8) | YES | MUL | NULL | |
| longitude | decimal(11,8) | YES | | NULL | |
+------------+---------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)
mysql> SELECT id, latitude, longitude, SQRT( POW(69.1 * (latitude - 4.66455174) , 2) + POW(69.1 * (-74.07867091 - longitude) * COS(latitude / 57.3) , 2)) AS distance FROM ranks ORDER BY distance ASC;
+----+-------------+--------------+--------------------+
| id | latitude | longitude | distance |
+----+-------------+--------------+--------------------+
| 4 | 4.66455174 | -74.07867091 | 0 |
| 10 | 4.13510880 | -73.63690401 | 47.59647003096195 |
| 11 | 6.55526689 | -73.13373892 | 145.86590936973073 |
| 5 | 6.24478548 | -75.57050110 | 149.74731096011348 |
| 7 | 7.06125013 | -73.84928550 | 166.35723903407165 |
| 9 | 3.48835279 | -76.51532198 | 186.68173882319724 |
| 8 | 7.88475514 | -72.49432589 | 247.53456848808233 |
| 1 | 60.00001000 | 101.00001000 | 7156.836171031409 |
| 3 | 60.00001000 | 101.00001000 | 7156.836171031409 |
+----+-------------+--------------+--------------------+
9 rows in set (0.00 sec)
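To make the arithmetic in that query easier to check, here is a minimal Python sketch of the same flat-earth approximation. The function name and constants are illustrative assumptions, not part of the original answer; the two coordinates are the reference point and row id 10 from the table above.

```python
import math

MILES_PER_DEGREE = 69.1   # approximate miles per degree of latitude (from the query)
KM_PER_MILE = 1.609344


def approx_distance_miles(lat, lng, ref_lat, ref_lng):
    """Flat-earth approximation mirroring the SQL expression above."""
    dy = MILES_PER_DEGREE * (lat - ref_lat)
    # 57.3 is roughly 180/pi, so lat / 57.3 converts degrees to radians
    dx = MILES_PER_DEGREE * (ref_lng - lng) * math.cos(lat / 57.3)
    return math.sqrt(dx * dx + dy * dy)


# Row id 10 from the example output, measured from the reference point
d = approx_distance_miles(4.13510880, -73.63690401, 4.66455174, -74.07867091)
print(round(d, 2), "miles =", round(d * KM_PER_MILE, 2), "km")
```

The value in miles agrees with the distance column shown for row id 10, which is a quick way to confirm the unit of the SQL result.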

You will need to use the Haversine formula to calculate distances taking into account the latitude and longitude:
dlon = lon2 - lon1
dlat = lat2 - lat1
a = (sin(dlat/2))^2 + cos(lat1) * cos(lat2) * (sin(dlon/2))^2
c = 2 * atan2( sqrt(a), sqrt(1-a) )
distance = R * c (where R is the radius of the Earth)
However, altitude makes the problem harder. If the route between point A and point B crosses large altitude differences, then assuming that the altitude changes linearly between the two points might be misleading, and not taking altitude into account at all might be very misleading. Compare the distance between a point in China and a point in India, with the Himalayas in between, with the distance between two points on the surface of the Pacific Ocean. One possibility would be to adjust R by the average of the two altitudes for each comparison, but for large distances this could be misleading, as discussed above.
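The Haversine steps listed above can be sketched in Python as follows. The mean Earth radius of 6371 km is an assumed value for R, and the function name is only illustrative; altitude is ignored here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius used for R


def haversine_km(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Great-circle distance between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return radius * c


# One degree of longitude along the equator
print(haversine_km(0.0, 0.0, 0.0, 1.0))
```

Passing a different `radius` is one crude way to experiment with the altitude adjustment discussed above.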

Related

MySQL MIN query not working for calculated distance

I have a table of locations in my database. I need a query to find out the nearest location, provided any coordinates. I wrote the following query to get all rows, along with their respective distance from a given coordinate(distance in meters):
SELECT *, 111111 * DEGREES(ACOS(LEAST(COS(RADIANS(dest.latitude)) * COS(RADIANS(8.584710)) * COS(RADIANS(dest.longitude - 76.868735)) + SIN(RADIANS(dest.latitude)) * SIN(RADIANS(8.584710)), 1.0))) as distance FROM offer dest;
It gives the following output:
+----+------------------------+----------+-----------+------------+---------------------+
| id | description | latitude | longitude | name | distance |
+----+------------------------+----------+-----------+------------+---------------------+
| 2 | Location 1 Description | 8.574858 | 76.874748 | Location 1 | 1278.565430298969 |
| 12 | Location 2 Description | 8.584711 | 76.868738 | Location 2 | 0.35494725284463646 |
+----+------------------------+----------+-----------+------------+---------------------+
It is all working fine. Now to get the Minimum distance, I added HAVING MIN(distance) to this query. Now the query looks like below:
SELECT *, 111111 * DEGREES(ACOS(LEAST(COS(RADIANS(dest.latitude)) * COS(RADIANS(8.584710)) * COS(RADIANS(dest.longitude - 76.868735)) + SIN(RADIANS(dest.latitude)) * SIN(RADIANS(8.584710)), 1.0))) as distance FROM offer dest having MIN(distance);
Now, this query is supposed to return 1 row, and that should be Location 2, as it has the minimum distance, but it returns Location 1 instead, as seen below:
+----+------------------------+----------+-----------+------------+---------------------+
| id | description | latitude | longitude | name | distance |
+----+------------------------+----------+-----------+------------+---------------------+
| 2 | Location 1 Description | 8.574858 | 76.874748 | Location 1 | 1278.565430298969 |
+----+------------------------+----------+-----------+------------+---------------------+
Why is this behaving so? Is there something wrong with my query? If yes, what is it, and how do I get the location with the minimum distance?
A HAVING clause filters groups, and groups come from a GROUP BY or from aggregate functions in the SELECT list. Your query has neither, so HAVING MIN(distance) only tests whether the minimum is non-zero; it does not pick the row that holds the minimum. You should not use HAVING here.
If you want the row with the minimum distance, order the rows by distance and limit the result set to just one row.
SELECT *,
111111 * DEGREES(ACOS(LEAST(COS(RADIANS(dest.latitude)) *
COS(RADIANS(8.584710)) * COS(RADIANS(dest.longitude - 76.868735)) +
SIN(RADIANS(dest.latitude)) * SIN(RADIANS(8.584710)), 1.0))) as distance
FROM offer dest
ORDER BY distance
LIMIT 1;
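The same "sort by distance, keep the first row" idea can be sketched in Python with the two rows from the question. The `distance_m` helper is an illustrative port of the SQL expression, not code from the original post.

```python
import math


def distance_m(lat, lng, ref_lat, ref_lng):
    """Spherical law of cosines, scaled to metres like the SQL expression."""
    cos_angle = (math.cos(math.radians(lat)) * math.cos(math.radians(ref_lat))
                 * math.cos(math.radians(lng - ref_lng))
                 + math.sin(math.radians(lat)) * math.sin(math.radians(ref_lat)))
    # Clamp like LEAST(..., 1.0): rounding can push the value just above 1
    return 111111 * math.degrees(math.acos(min(cos_angle, 1.0)))


offers = [
    {"id": 2, "name": "Location 1", "latitude": 8.574858, "longitude": 76.874748},
    {"id": 12, "name": "Location 2", "latitude": 8.584711, "longitude": 76.868738},
]

# Equivalent of ORDER BY distance LIMIT 1
nearest = min(offers, key=lambda o: distance_m(o["latitude"], o["longitude"], 8.584710, 76.868735))
print(nearest["name"])  # prints "Location 2"
```

Selecting the minimum is a row-level operation (a sort or a `min` over rows), which is why a group filter such as HAVING cannot express it.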

Find the nearest point from a given point in MySQL

I have a set of points from which I created a multipoint.
SET @multi_point = ST_GeomFromText('MULTIPOINT(-118.2845938 34.0252385, -118.2867610 34.0221188, -118.2905912 34.0227248, -118.284119 34.021846, -118.2864676 34.0186438, -118.2886342 34.0203211, -118.2907290 34.0193680, -118.2831326 34.0192874, -118.2828242 34.0205473)');
I also have a single other point
SET @home = ST_GeomFromText('POINT(-118.2819136 34.0261177)')
From @home I want to find the nearest 3 points that are part of the multipoint @multi_point. I thought of first determining the distance from @home to every point in the multipoint using ST_DISTANCE. But I get the following:
SELECT ST_DISTANCE(@home, @multi_point)
+----------------------------------+
| ST_DISTANCE(@home, @multi_point) |
+----------------------------------+
| 0.0028207205958764945 |
+----------------------------------+
It looks like I only get the shortest distance of the point to the multipoint. Questions:
Is there a way to get the shortest distance to every point, and determine the points themselves?
Is there a better way to determine the nearest neighbors from one point @home to the rest in @multi_point?
EDIT: I tried Paul Spiegel's suggestion of storing these points in a table. But when I store a Point type, it is stored as a garbage value:
+-----------------+---------------------------+
| Name | Coordinates |
+-----------------+---------------------------+
| P1 | [2]pxA# |
| p2 | +1Z]:PA# |
| p3 | a]KA# |
| p4 | X ]qgpA# |
| p5 | EW3|U]A# |
| p6 | M]bKzA# |
| p7 | Ku/]2ƇA# |
| p9 | =];A# |
| p8 | u`x]A# |
+-----------------+---------------------------+
Also, when I query it, the distances are NULL.
SELECT P.Name, ST_DISTANCE(@home, P.Coordinates) AS dist
FROM Placemark P ORDER BY dist LIMIT 3;
+--------------+------+
| Name | dist |
+--------------+------+
| p1 | NULL |
| p2 | NULL |
| p3 | NULL |
+--------------+------+
How do I fix this?
Here is an example using a temporary table:
drop temporary table if exists Placemark;
create temporary table Placemark(
Name varchar(50),
Coordinates point
);
insert into Placemark(Name, Coordinates) values
('p1', POINT(-118.2845938, 34.0252385)),
('p2', POINT(-118.2867610, 34.0221188)),
('p3', POINT(-118.2905912, 34.0227248)),
('p4', POINT(-118.284119 , 34.021846 )),
('p5', POINT(-118.2864676, 34.0186438)),
('p6', POINT(-118.2886342, 34.0203211)),
('p7', POINT(-118.2907290, 34.0193680)),
('p8', POINT(-118.2831326, 34.0192874)),
('p9', POINT(-118.2828242, 34.0205473));
SET @home = ST_GeomFromText('POINT(-118.2819136 34.0261177)');
SELECT
P.Name,
ST_AsText(P.Coordinates) as Coordinates,
ST_DISTANCE(@home, P.Coordinates) AS dist
FROM Placemark P
ORDER BY dist
LIMIT 3;
The result:
Name | Coordinates | dist
p1 | POINT(-118.2845938 34.0252385) | 0.00282072059587649
p4 | POINT(-118.284119 34.021846) | 0.00480741199088132
p9 | POINT(-118.2828242 34.0205473) | 0.00564433773972052
Demo: http://rextester.com/GMPK57301
To define a POINT use either POINT(-118.2845938, 34.0252385) or ST_GeomFromText('POINT(-118.2845938 34.0252385)'). To see the value in a readable form use ST_AsText().
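As a cross-check of the result above, here is a small Python sketch that ranks the same nine points by straight-line distance in coordinate units (degrees). This assumes, as the numbers in the result suggest, that ST_DISTANCE on these points is a plain planar distance; the dictionary and helper names are illustrative.

```python
import math

home = (-118.2819136, 34.0261177)
placemarks = {
    "p1": (-118.2845938, 34.0252385),
    "p2": (-118.2867610, 34.0221188),
    "p3": (-118.2905912, 34.0227248),
    "p4": (-118.284119, 34.021846),
    "p5": (-118.2864676, 34.0186438),
    "p6": (-118.2886342, 34.0203211),
    "p7": (-118.2907290, 34.0193680),
    "p8": (-118.2831326, 34.0192874),
    "p9": (-118.2828242, 34.0205473),
}


def planar_distance(a, b):
    """Euclidean distance in coordinate units (degrees here, not metres)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Equivalent of ORDER BY dist LIMIT 3
nearest = sorted(placemarks, key=lambda name: planar_distance(home, placemarks[name]))[:3]
print(nearest)  # prints ['p1', 'p4', 'p9']
```

Because the distances are in degrees, they are only good for ranking nearby points; converting to metres needs one of the spherical formulas.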

mysql SELECT MIN from WHERE result

I have a table with several routes, each of which has several points defined by latitude and longitude.
table name: route_path
|id_route |id_point| lat | lng |
|hhVFlBFA0M| 328| 48.90008 | 18.0233 |
|hhVFlBFA0M| 329| 48.90003 | 18.0268 |
|hhVFlBFA0M| 330| 48.89997 | 18.02856 |
|hhVFlBFA0M| 331| 48.89991 | 18.02857 |
|hhVFlBFA0M| 332| 48.89986 | 18.02862 |
|hhVFlBFA0M| 333| 48.89982 | 18.02869 |
|hhVFlBFA0M| 334| 48.89981 | 18.02878 |
|hhVFlBFA0M| 335| 48.89981 | 18.02886 |
|hhVFlBFA0M| 336| 48.89956 | 18.02925 |
|hhVFlBFA0M| 337| 48.89914 | 18.02972 |
|hhVFlBFA0M| 338| 48.8986177 | 18.0302365|
|3toCyDGVV2| 1| 48.134166 | 17.1051961|
|3toCyDGVV2| 2| 48.13417 | 17.1052 |
|3toCyDGVV2| 3| 48.13344 | 17.10559 |
|3toCyDGVV2| 4| 48.13298 | 17.10609 |
|3toCyDGVV2| 5| 48.13221 | 17.10699 |
|3toCyDGVV2| 6| 48.132 | 17.10806 |
|3toCyDGVV2| 7| 48.13193 | 17.10997 |
|3toCyDGVV2| 8| 48.13203 | 17.1109 |
|3toCyDGVV2| 9| 48.132 | 17.1 112 |
|3toCyDGVV2| 10| 48.13181512| 17.1112 |
|3toCyDGVV2| 11| 48.13181 | 17.10806 |
|3toCyDGVV2| 12| 48.13181 | 17.10806 |
|3toCyDGVV2| 13| 48.13197 | 17.10399 |
|3toCyDGVV2| 14| 48.13199 | 17.10352 |
|3toCyDGVV2| 15| 48.1323 | 17.10328 |
So far I can select all rows from one route that are within a tolerated distance and then loop over them to find the point with the minimal distance.
SELECT * FROM route_path
WHERE
(((lat < $start_lat + $tolerance) AND
(lat > $start_lat - $tolerance)) AND
((lng < $start_lng + $tolerance) AND
(lng > $start_lng - $tolerance)))
So this results in several rows (id_points) for each route, and then I need a while loop to find the minimum.
How can I select one row (one id_point) from each route with the minimal distance from the start lat and lng, provided this distance is not more than some value?
Any suggestions for an SQL query without looping?
Basically I need something like the following, but of course it is not possible to use MIN inside WHERE:
SELECT * FROM route_path WHERE **MIN(**(((lat < $start_lat + $tolerance) AND (lat > $start_lat - $tolerance)) AND ((lng < $start_lng + $tolerance) AND (lng > $start_lng - $tolerance)))**)**
There are a few ways to calculate the distance between 2 points. The most efficient is probably to use spatial data types, which are designed for this and have indexes for it. I am not yet that experienced with these, so if you want to alter your database to use them I will just point you at this previous question to get the basics (the accepted answer covers it):
Fastest Way to Find Distance Between Two Lat/Long Points
If you want to use your table as it currently stands, then you can get the distance in km between 2 points with the following calculation:
111.045 * DEGREES(ACOS(COS(RADIANS(lat_point_1))
* COS(RADIANS(lat_point_2))
* COS(RADIANS(long_point_1) - RADIANS(long_point_2))
+ SIN(RADIANS(lat_point_1))
* SIN(RADIANS(lat_point_2))))
(taken from here).
Using this, if you wanted to know the closest point on a particular route to your starting point, you could use the following. There is no need to multiply by 111.045 unless you care about the actual distance rather than just which point is closest, so the value is left in degrees of arc:
SELECT id_route,
id_point,
lat,
lng,
DEGREES(ACOS(COS(RADIANS($start_lat))
* COS(RADIANS(lat))
* COS(RADIANS($start_lng) - RADIANS(lng))
+ SIN(RADIANS($start_lat))
* SIN(RADIANS(lat)))) AS distance_in_degrees
FROM route_path
WHERE id_route = 'hhVFlBFA0M'
ORDER BY distance_in_degrees
LIMIT 1
If you wanted to know the closest point on EACH route to your starting point, you would calculate the minimum distance for each route, then join that to your original table where the distance for a point matches that minimum. (This will cause a problem if 2 points on a single route are exactly the same distance from your start point: both will be returned.)
SELECT route_path.id_route,
route_path.id_point,
route_path.lat,
route_path.lng
FROM route_path
INNER JOIN
(
SELECT id_route,
MIN(DEGREES(ACOS(COS(RADIANS($start_lat))
* COS(RADIANS(lat))
* COS(RADIANS($start_lng) - RADIANS(lng))
+ SIN(RADIANS($start_lat))
* SIN(RADIANS(lat))))) AS distance_in_degrees
FROM route_path
GROUP BY id_route
) sub0
ON route_path.id_route = sub0.id_route
AND DEGREES(ACOS(COS(RADIANS($start_lat))
* COS(RADIANS(lat))
* COS(RADIANS($start_lng) - RADIANS(lng))
+ SIN(RADIANS($start_lat))
* SIN(RADIANS(lat)))) = sub0.distance_in_degrees
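The "minimum per route" logic of the query above can be sketched in Python like this. The start coordinates are made up for illustration, only a few rows of the question's table are included, and the helper names are assumptions.

```python
import math


def angular_distance_deg(lat1, lng1, lat2, lng2):
    """Angle between two points in degrees of arc (same formula as the SQL)."""
    cos_angle = (math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
                 * math.cos(math.radians(lng1) - math.radians(lng2))
                 + math.sin(math.radians(lat1)) * math.sin(math.radians(lat2)))
    # Clamp to [-1, 1] so rounding errors cannot break acos
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


# (id_route, id_point, lat, lng): a subset of the rows in the question
route_path = [
    ("hhVFlBFA0M", 328, 48.90008, 18.0233),
    ("hhVFlBFA0M", 329, 48.90003, 18.0268),
    ("hhVFlBFA0M", 330, 48.89997, 18.02856),
    ("3toCyDGVV2", 2, 48.13417, 17.1052),
    ("3toCyDGVV2", 3, 48.13344, 17.10559),
    ("3toCyDGVV2", 4, 48.13298, 17.10609),
]


def closest_per_route(rows, start_lat, start_lng):
    """Return {id_route: id_point} for the nearest point on each route."""
    best = {}
    for id_route, id_point, lat, lng in rows:
        d = angular_distance_deg(start_lat, start_lng, lat, lng)
        # Strict '<' keeps the first point on ties, so each route yields one row
        if id_route not in best or d < best[id_route][1]:
            best[id_route] = (id_point, d)
    return {route: point for route, (point, _) in best.items()}


result = closest_per_route(route_path, 48.9000, 18.0270)  # hypothetical start point
print(result)
```

Unlike the join on equal distances, this keeps exactly one point per route even when two points tie.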

How can I optimize this stored procedure?

I need some help optimizing this procedure:
DELIMITER $$
CREATE DEFINER=`ryan`@`%` PROCEDURE `GetCitiesInRadius`(
cityID numeric (15),
`range` numeric (15)
)
BEGIN
DECLARE lat1 decimal (5,2);
DECLARE long1 decimal (5,2);
DECLARE rangeFactor decimal (7,6);
SET rangeFactor = 0.014457;
SELECT `latitude`,`longitude` into lat1,long1
FROM world_cities as wc WHERE city_id = cityID;
SELECT
wc.city_id,
wc.accent_city as city,
s.state_name as state,
c.short_name as country,
GetDistance(lat1, long1, wc.`latitude`, wc.`longitude`) as dist
FROM world_cities as wc
left join states s on wc.state_id = s.state_id
left join countries c on wc.country_id = c.country_id
WHERE
wc.`latitude` BETWEEN lat1 -(`range` * rangeFactor) AND lat1 + (`range` * rangeFactor)
AND wc.`longitude` BETWEEN long1 - (`range` * rangeFactor) AND long1 + (`range` * rangeFactor)
AND GetDistance(lat1, long1, wc.`latitude`, wc.`longitude`) <= `range`
ORDER BY dist limit 6;
END
Here is my explain on the main portion of the query:
+----+-------------+-------+--------+---------------+--------------+---------+--------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+--------------+---------+--------------------------+------+----------------------------------------------+
| 1 | SIMPLE | B | range | idx_lat_long | idx_lat_long | 12 | NULL | 7619 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | s | eq_ref | PRIMARY | PRIMARY | 4 | civilipedia.B.state_id | 1 | |
| 1 | SIMPLE | c | eq_ref | PRIMARY | PRIMARY | 1 | civilipedia.B.country_id | 1 | Using where |
+----+-------------+-------+--------+---------------+--------------+---------+--------------------------+------+----------------------------------------------+
3 rows in set (0.00 sec)
Here are the indexes:
mysql> show indexes from world_cities;
+--------------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| world_cities | 0 | PRIMARY | 1 | city_id | A | 3173958 | NULL | NULL | | BTREE | |
| world_cities | 1 | country_id | 1 | country_id | A | 23510 | NULL | NULL | YES | BTREE | |
| world_cities | 1 | city | 1 | city | A | 3173958 | NULL | NULL | YES | BTREE | |
| world_cities | 1 | accent_city | 1 | accent_city | A | 3173958 | NULL | NULL | YES | BTREE | |
| world_cities | 1 | idx_pop | 1 | population | A | 28854 | NULL | NULL | YES | BTREE | |
| world_cities | 1 | idx_lat_long | 1 | latitude | A | 1057986 | NULL | NULL | YES | BTREE | |
| world_cities | 1 | idx_lat_long | 2 | longitude | A | 3173958 | NULL | NULL | YES | BTREE | |
| world_cities | 1 | accent_city_2 | 1 | accent_city | NULL | 1586979 | NULL | NULL | YES | FULLTEXT | |
+--------------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
8 rows in set (0.01 sec)
The function you see in the query I wouldn't think would cause the slow down, but here is the function:
CREATE DEFINER=`ryan`@`%` FUNCTION `GetDistance`(lat1 numeric (9,6),
lon1 numeric (9,6),
lat2 numeric (9,6),
lon2 numeric (9,6) ) RETURNS decimal(10,5)
BEGIN
DECLARE x decimal (20,10);
DECLARE pi decimal (21,20);
SET pi = 3.14159265358979323846;
SET x = sin( lat1 * pi/180 ) * sin( lat2 * pi/180 ) + cos(
lat1 *pi/180 ) * cos( lat2 * pi/180 ) * cos( (lon2 * pi/180) -
(lon1 *pi/180)
);
SET x = atan( ( sqrt( 1- power( x, 2 ) ) ) / x );
RETURN ( 1.852 * 60.0 * ((x/pi)*180) ) / 1.609344;
END
As far as I can tell there is nothing directly wrong with your logic that would make this slow, so the problem ends up being that this query can make very little use of indexes.
MySQL has to apply the functions in your WHERE clause to every candidate row to determine whether it passes the conditions. Currently only 1 index is used: idx_lat_long.
It's a bit of a bad index: the long portion will never be used to narrow the scan, because the lat portion is matched by a range rather than a single value. But at the very least you managed to effectively filter out all rows that are outside the latitude range. It's likely that this still leaves a lot of rows, though.
You'd actually get slightly better results on the longitude, because humans only really live in the middle 30% of the earth. We're very much spread out horizontally, but not really vertically.
Regardless, the best way to further narrow the field is to filter out as many records outside the general area as possible. Right now the index only narrows things down to a full horizontal band around the earth; try to make it a bounding box.
You could naively dice up the earth into, say, 10x10 segments. This would in the best case make sure the query is limited to 1% of the earth ;).
But as soon as your bounding box spans two separate segments, only the first coordinate (lat or lng) can be used in the index and you end up with the same problem.
So when I thought about this problem I started looking at it differently. Instead, I divided up the earth into 4 segments (let's say north east, north west, south east and south west on a map). So this gives me coordinates like:
0,0
0,1
1,0
1,1
Instead of putting the x and y values in 2 separate fields, I used them as a bit field and stored both at once.
Then I divided each of the 4 boxes up again, which gives us 2 sets of coordinates: the outer and the inner coordinates. I'm still encoding this in the same field, which means we now use 4 bits for our 4x4 coordinate system.
How far can we go? If we assume a 64 bit integer field, it means that 32 bits can be used for each of the 2 coordinates. This gives us a grid system of 4294967295 x 4294967295 all encoded into one database field.
The beauty of this field is that you can index it. This is sometimes called (I believe) a Quad-tree. If you need to select a big area in your database, you just calculate the 64 bit value of the top-left coordinate (in the 4294967295 x 4294967295 grid system) and of the bottom-right one, and it's guaranteed that anything that lies in that box will also lie between the two numbers.
How do you get to those numbers? Let's be lazy and assume that both our x and y coordinates range from -180 to 180 degrees. (The y coordinate of course is half that, but we're lazy.)
First we make it positive:
// assuming x and y are our long and lat.
x += 180;
y += 180;
So the max for those is 360 now, and 4294967295 / 360 is around 11930464.
So to convert to our new grid system, we just do:
x *= 11930464;
y *= 11930464;
Now we have two distinct numbers, and we need to turn them into 1 number. First bit 1 of x, then bit 1 of y, bit 2 of x, bit 2 of y, etc.
// The 'morton number'
morton = 0
// The current bit we're interleaving
bit = 1
// The position of the bit we're interleaving
position = 0
while(bit <= latitude or bit <= longitude) {
if (bit & latitude) morton = morton | 1 << (2*position+1)
if (bit & longitude) morton = morton | 1 << (2*position)
position += 1
bit = 1 << position
}
I'm calling the final variable 'morton', after the guy who came up with it in 1966.
So this leaves us finally with the following:
For each row in your database, calculate the morton number and store it.
Whenever you do a query, first determine the maximum bounding box (as the morton number) and filter on that.
This will greatly reduce the number of records you need to check.
Here's a stored function I wrote that will do the calculation for you:
CREATE FUNCTION getGeoMorton(lat DOUBLE, lng DOUBLE) RETURNS BIGINT UNSIGNED DETERMINISTIC
BEGIN
-- 11930464 is floor(maximum value of a 32bit integer / 360 degrees)
DECLARE bit, morton, pos BIGINT UNSIGNED DEFAULT 0;
SET @lat = CAST((lat + 90) * 11930464 AS UNSIGNED);
SET @lng = CAST((lng + 180) * 11930464 AS UNSIGNED);
SET bit = 1;
-- OR is used instead of || so the loop also works with PIPES_AS_CONCAT enabled
WHILE bit <= @lat OR bit <= @lng DO
IF(bit & @lat) THEN SET morton = morton | ( 1 << (2 * pos + 1)); END IF;
IF(bit & @lng) THEN SET morton = morton | ( 1 << (2 * pos)); END IF;
SET pos = pos + 1;
SET bit = 1 << pos;
END WHILE;
RETURN morton;
END;
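For experimenting outside the database, the bit interleaving can be sketched in Python as below. This is a loose port, not the author's code: the function names are made up, and `int()` truncates where MySQL's CAST may round, so values can differ in the last bit.

```python
SCALE = 11930464  # scale factor taken from the SQL function above


def interleave(lat_int, lng_int):
    """Interleave bits: longitude on even positions, latitude on odd ones."""
    morton = 0
    pos = 0
    while (1 << pos) <= lat_int or (1 << pos) <= lng_int:
        bit = 1 << pos
        if lat_int & bit:
            morton |= 1 << (2 * pos + 1)
        if lng_int & bit:
            morton |= 1 << (2 * pos)
        pos += 1
    return morton


def geo_morton(lat, lng):
    """Shift coordinates to be non-negative, scale to integers, interleave."""
    return interleave(int((lat + 90) * SCALE), int((lng + 180) * SCALE))


# Corners of a bounding box and a point inside it (illustrative coordinates)
low = geo_morton(48.0, 17.0)     # south-west corner
high = geo_morton(49.0, 18.5)    # north-east corner
inside = geo_morton(48.9, 18.02)
print(low <= inside <= high)  # prints True
```

The final check illustrates the range property used for the rough filter: a point inside the box always has a Morton number between those of the box's minimum and maximum corners, although points outside the box can fall in that range too.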
A few caveats:
The absolute worst case scenario will still scan 50% of your entire table. This chance is extremely low though, and I've seen significant performance increases for most real-world queries.
The bounding box in this case assumes a Euclidean space, meaning a flat surface. In reality your bounding boxes are not exact squares, and they warp heavily when getting closer to the poles. By just making the boxes a bit larger (depending on how exact you want to be) you can get quite far. Most real-world data is also often not close to the poles ;). Remember that this filter is just a 'rough filter' to get most of the likely unwanted rows out.
This is based on a so-called Z-order curve. To get even better performance, if you're feeling adventurous, you could try to go for the Hilbert curve instead. This curve oddly rotates, which ensures that in a worst case scenario you will only scan about 25% of the table. Magic! In general this one will also filter out many more unwanted rows.
Source for all this: I wrote 3 blogposts about this topic when I came to the same problems and tried to creatively get to a solution. I got much better performance with this compared to MySQL's GEO indexes.
http://www.rooftopsolutions.nl/blog/229
http://www.rooftopsolutions.nl/blog/230
http://www.rooftopsolutions.nl/blog/231

Geolocation distance SQL from a cities table [duplicate]

So I have this stored procedure to find the nearest cities based on latitude, longitude and radius parameters.
DELIMITER $$
DROP PROCEDURE IF EXISTS `world_db`.`geolocate_close_cities`$$
CREATE PROCEDURE `geolocate_close_cities`(IN p_latitude DECIMAL(8,2), p_longitude DECIMAL(8,2), IN p_radius INTEGER(5))
BEGIN
SELECT id, country_id, longitude, latitude, city,
truncate((degrees(acos( sin(radians(latitude))
* sin(radians(p_latitude))
+ cos(radians(latitude))
* cos(radians(p_latitude))
* cos(radians(p_longitude - longitude) ) ) )
* 69.09*1.6),1) as distance
FROM cities
HAVING distance < p_radius
ORDER BY distance desc;
END$$
DELIMITER ;
Here's the structure of my cities table:
+------------+-------------+------+-----+---------+----------------+
| Field      | Type        | Null | Key | Default | Extra          |
+------------+-------------+------+-----+---------+----------------+
| id         | int(11)     | NO   | PRI | NULL    | auto_increment |
| country_id | smallint(6) | NO   |     | NULL    |                |
| region_id  | smallint(6) | NO   |     | NULL    |                |
| city       | varchar(45) | NO   |     | NULL    |                |
| latitude   | float       | NO   |     | NULL    |                |
| longitude  | float       | NO   |     | NULL    |                |
| timezone   | varchar(10) | NO   |     | NULL    |                |
| dma_id     | smallint(6) | YES  |     | NULL    |                |
| code       | varchar(4)  | YES  |     | NULL    |                |
+------------+-------------+------+-----+---------+----------------+
It works very well.
What I'd like to do (pseudocode) is something like:
SELECT * FROM cities WHERE DISTANCE(SELECT id FROM cities WHERE id={cityId}, {km))
and it'll return me the closest cities.
Any ideas of how I can do this?
At the moment, I just call the procedure, collect the ids into an array and then perform a WHERE IN on the cities table, which obviously isn't very efficient.
Any help is MUCH appreciated. Thanks.
If you can limit the maximum distance between your cities and your local position, take advantage of the fact that one minute of latitude (north - south) is one nautical mile.
Put an index on your latitude column.
Make yourself a haversine(lat1, lat2, long1, long2, unit) stored function from the distance formula shown in your question (see the link at the end of this answer).
Then do this, given mylatitude, mylongitude, and mykm:
SELECT *
from cities a
where :mylatitude >= a.latitude - :mykm/111.12
and :mylatitude <= a.latitude + :mykm/111.12
and haversine(:mylatitude,a.latitude,:mylongitude,a.longitude, 'KM') <= :mykm
order by haversine(:mylatitude,a.latitude,:mylongitude,a.longitude, 'KM')
This will use a latitude bounding box to crudely rule out cities that are too far away from your point. Your DBMS will use an index range scan on your latitude index to quickly pick out the rows in your cities table that are worth considering. Then it will run your haversine function, the one with all the sine and cosine maths, only on those rows.
I suggest latitude because the on-the-ground distance of longitude varies with latitude.
Note this is crude. It's fine for a store-finder, but don't use it if you're a civil engineer: the earth is an ellipsoid and this assumes it's a sphere.
(Sorry about the 111.12 magic number. That's the number of km in a degree of latitude, that is in sixty nautical miles.)
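The two-stage filter described above (cheap latitude band first, exact distance second) can be sketched in Python as follows. The city list, names and coordinates are invented for illustration, and the Earth radius of 6371 km is an assumption.

```python
import math

KM_PER_DEGREE_LAT = 111.12  # sixty nautical miles of 1.852 km each


def haversine_km(lat1, lon1, lat2, lon2, radius=6371.0):
    """Great-circle distance in km; radius is an assumed mean Earth radius."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius * math.atan2(math.sqrt(a), math.sqrt(1 - a))


def cities_within(cities, my_lat, my_lng, my_km):
    """Return (distance_km, city) pairs within my_km, nearest first."""
    delta = my_km / KM_PER_DEGREE_LAT
    # Stage 1: crude latitude band (this is what the index range scan does)
    candidates = [c for c in cities if my_lat - delta <= c["latitude"] <= my_lat + delta]
    # Stage 2: exact distance, computed only for the surviving rows
    hits = []
    for c in candidates:
        d = haversine_km(my_lat, my_lng, c["latitude"], c["longitude"])
        if d <= my_km:
            hits.append((d, c["city"]))
    return sorted(hits)


# Hypothetical sample rows
cities = [
    {"city": "Alpha", "latitude": 48.85, "longitude": 2.35},
    {"city": "Beta", "latitude": 48.90, "longitude": 2.40},
    {"city": "Gamma", "latitude": 51.50, "longitude": -0.12},  # outside the latitude band
    {"city": "Delta", "latitude": 48.86, "longitude": 10.00},  # inside the band, but too far east
]
names = [name for _, name in cities_within(cities, 48.85, 2.35, 50)]
print(names)  # prints ['Alpha', 'Beta']
```

"Delta" shows why the second stage is still needed: it passes the latitude band but fails the exact distance check.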
See here for a workable distance function.
Why does this MySQL stored function give different results than to doing the calculation in the query?