I have a database that I populate with specific locations of my choice. For each location, I will provide a longitude and latitude.
I want to get the geoIP (the longitude and latitude for a given IP matched in a database like maxmind.com) of a visitor to my website. Using the geoIP, I want to find the closest location to the visitor from my locations database table.
I've been spending a lot of time trying to figure out how to accomplish this in an efficient manner; I don't want the process to be expensive for every visitor. The process doesn't have to be precise, just precise enough. If it gives a user Sacramento instead of San Francisco, which would have been more correct, that's okay, as long as the degree of error is small enough that it won't bother a majority of users. What wouldn't be okay is if the algorithm gave a location completely off and irrelevant to where they are, say Chicago (when they live in California).
So with that said, what are some solutions?
Here are some of my ideas:
1
Use the Pythagorean theorem to find the distance between two points in a plane. The only problem is that the earth is not a planar coordinate system, but rather a spherical one. So, I'll need a way to convert geolocation data (long. and lat.) into X and Y coordinates. Then I could run a SQL query that finds the record with the shortest distance, which is calculated using: sqrt(abs(locationX - geoIpX)^2 + abs(locationY - geoIpY)^2)
I'm not sure if this is a plausible solution. If it is, then please smooth out the rough edges for me so I can implement it.
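For idea 1, here is a minimal sketch in Python (not from the original post) of one common way to get planar X and Y from longitude and latitude: scale the longitude difference by the cosine of the latitude, then apply Pythagoras. The function names and the sample coordinates are made up for illustration, and the approximation assumes the two points are reasonably close and do not straddle the 180 degree meridian.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def approx_distance_km(lat1, lon1, lat2, lon2):
    """Planar (equirectangular) approximation of the distance in km."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    # East-west degrees shrink with the cosine of the latitude.
    x = math.radians(lon2 - lon1) * math.cos(mean_lat)
    y = math.radians(lat2 - lat1)
    return EARTH_RADIUS_KM * math.hypot(x, y)


def nearest(visitor, locations):
    """Return the (name, lat, lon) row closest to the visitor (lat, lon)."""
    return min(
        locations,
        key=lambda loc: approx_distance_km(visitor[0], visitor[1], loc[1], loc[2]),
    )


# Illustrative data only.
LOCATIONS = [
    ("San Francisco", 37.7749, -122.4194),
    ("Sacramento", 38.5816, -121.4944),
    ("Chicago", 41.8781, -87.6298),
]

closest = nearest((37.80, -122.27), LOCATIONS)
```

The same expression can be written directly in SQL, so the database can order rows by it; for a "good enough" nearest-city lookup the error of this approximation is usually far smaller than the error of the geoIP lookup itself.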
2
Figure out an algorithm that uses the differences in longitudes and latitudes. For example, first find the locations that have the closest longitude to the geoIP's longitude, then find the location out of that set that has the closest latitude to the geoIP's latitude. The only issue with the algorithm just described is that the margin of error can potentially be huge. For example, say a city in California is 3 degrees of longitude away from the geoIP, but a city in Canada is only 2 degrees of longitude away. The Canadian longitude would then be used to find the closest latitude, and the only candidate may be the city in Canada. Even though the Canadian city's latitude is many more degrees away than the Californian city's, the user will be presented with data relevant to a Canadian instead of a Californian, which would be too much of an error. However, maybe there are some modifications to this algorithm that may fix this?
afterword
Thanks for reading. All solutions and help are highly appreciated! :)
You can use the Haversine formula.
http://www.scribd.com/doc/2569355/Geo-Distance-Search-with-MySQL
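To optimize the search, you can restrict it to a box of minimum and maximum coordinates.
You can compute those min/max coordinates in server-side code, or use a stored procedure to calculate them.
Example for the min/max coordinates ($distance in miles; 69 is roughly the number of miles per degree of latitude):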
$minLng = $lng-($distance/(abs(cos(deg2rad($lat))*69)));
$maxLng = $lng+($distance/(abs(cos(deg2rad($lat))*69)));
$minLat = $lat-($distance/69);
$maxLat = $lat+($distance/69);
MySQL + PHP:
// assumes $destination is an object with lat and long properties
$select = "3956 * 2 * ASIN(SQRT(POWER(SIN((initial.pin_lat - " . $destination->lat . ") * pi()/180 / 2), 2) + COS(" .
"initial.pin_lat * pi()/180) * COS(" . $destination->lat . " * pi()/180) * POWER(SIN((" .
"initial.pin_lng - " . $destination->long . ") * pi()/180 / 2), 2))) AS distance_to_destination";
Then use distance_to_destination in a HAVING clause (or repeat the expression in the WHERE clause),
and use the min/max coordinates in the WHERE clause
to narrow down the search:
$where = "initial.pin_lng BETWEEN "
. $minLng . " AND " . $maxLng . " AND initial.pin_lat BETWEEN " . $minLat . " AND "
. $maxLat . " HAVING distance_to_destination < " . $prefered_distance;
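To make the two steps above easier to test outside the database, here is a rough Python sketch of the same idea: a cheap bounding box first, then the Haversine distance only for rows inside the box. The constants 3956 (Earth radius in miles) and 69 (miles per degree) come from the snippets above; the function names and the sample rows are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3956.0
MILES_PER_DEGREE = 69.0  # approximate length of one degree of latitude


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def bounding_box(lat, lon, distance_miles):
    """Return (min_lat, max_lat, min_lon, max_lon) around a centre point."""
    dlat = distance_miles / MILES_PER_DEGREE
    # Note the degrees-to-radians conversion before taking the cosine.
    dlon = distance_miles / (MILES_PER_DEGREE * abs(math.cos(math.radians(lat))))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


def nearest_within(lat, lon, rows, distance_miles):
    """Name of the closest row within distance_miles, or None if there is none."""
    min_lat, max_lat, min_lon, max_lon = bounding_box(lat, lon, distance_miles)
    hits = []
    for name, row_lat, row_lon in rows:
        # Cheap box test first (this is the part an index can serve in SQL).
        if min_lat <= row_lat <= max_lat and min_lon <= row_lon <= max_lon:
            d = haversine_miles(lat, lon, row_lat, row_lon)
            if d <= distance_miles:
                hits.append((d, name))
    return min(hits)[1] if hits else None


# Illustrative rows only.
ROWS = [
    ("San Francisco", 37.7749, -122.4194),
    ("Sacramento", 38.5816, -121.4944),
    ("Los Angeles", 34.0522, -118.2437),
]
```

For example, `nearest_within(37.80, -122.27, ROWS, 100)` looks for the closest of the sample rows within 100 miles of a point near Oakland.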
You want the Haversine formula. Since that calculation involves trigonometry functions, it tends to be kind of costly.
You might consider copying part of your data to a companion technology that supports built-in geolocation searches. MySQL isn't so good at this type of search, but some other tools are.
For example, Sphinx Search has support for geo distance searches:
http://sphinxsearch.com/blog/2013/07/02/geo-distances-with-sphinx/
MySQL does support spatial indexes, but (in versions before 5.7) only in MyISAM tables. There are many reasons to avoid using MyISAM tables.
MySQL is as good as other databases here; you can use a spatial index. I have written a PHP class that implements a spatial index with a monster curve. You can download my Hilbert curve package at phpclasses. It also uses the Mercator projection, and I'm using it very successfully. Other databases have native support, but you can do it yourself in a few hours.
Related
I would like to query for all possible streetnames within a radius of 500 meters of a given point.
Multiple posts refer to the Google store locator example, which uses the Haversine formula or some version of it.
But I also came across some posts that have a much more simplified solution.
They just treat the points as x,y coordinates by adding to the lat and long variables as seen below.
I was wondering if this would be the fastest way to query MySQL without getting really complicated while still getting a good result. I don't have a lot of data yet, so I want to know if I am on the right track.
Are there any disadvantages or inaccuracies in using this method?
What I don't get is how this can be a radius-like range; it looks more like a one-directional query?
Distance = 0.1; // Range in degrees (0.1 degrees is close to 11km)
LatN = lat + Distance;
LatS = lat - Distance;
LonE = lon + Distance;
LonW = lon - Distance;
...Query DB with something like the following:
SELECT *
FROM table_name
WHERE
(store_lat BETWEEN LatS AND LatN) AND
(store_lon BETWEEN LonW AND LonE)
You might want to ask this question on the GIS site. I've used this answer myself for similar problems. I can see how your proposed solution might be faster, but note your four points are describing a square not a circle so it would not be considered a "radius".
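To put numbers on the "square, not a circle" point, here is a small Python sketch (my own illustration, with hypothetical names) that converts a box of plus or minus some degrees into kilometres at a given latitude. It assumes a spherical Earth with about 111.195 km per degree of latitude.

```python
import math

KM_PER_DEGREE = 111.195  # approximate km per degree of latitude on a sphere


def box_extent_km(lat, half_side_deg):
    """Return (north_south_km, east_west_km, corner_km) for a +/- degree box."""
    north_south = half_side_deg * KM_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(lat))
    # The corners of the box are farther from the centre than the edges.
    corner = math.hypot(north_south, east_west)
    return north_south, east_west, corner


ns_km, ew_km, corner_km = box_extent_km(52.0, 0.1)
```

At latitude 52 the 0.1 degree box reaches roughly 11 km north and south but only about 7 km east and west, and its corners are about 13 km away, so a box filter is best treated as a cheap first pass followed by a real distance check on the rows it returns.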
You could use Elasticsearch for that; MySQL will always be slower than ES.
I'm working on a MYSQL/PHP system where I have the following:
-- a set of latitude, longitude in the form of (lat,lng) stored as text format : (lat1,lng1)#(lat2,lng2)#(lat3,lng3) etc. which is basically a polygon drawn over a googlemap instance stored in the database.
-- a table which stores in a field - a point's coordinates P(plat,plng) which is basically a point where a device is stationed
I need to figure out how many polygons from the first table are within a distance of X kilometers from the point P essentially using MYSQL.
I have come across quite a few Google Map libraries regarding this already, but I intend to resolve this by the quickest method possible - which I assume is via a MYSQL query.
Can anyone please please shed some light regarding this?
I've so far consulted a few examples on geospatial querying - and come up with this :
SELECT user_id, latitude, longitude,
GeomFromText( "POINT(CONCAT_WS(' ',latitude,longitude))" ) AS point,
Contains( GeomFromText( 'POLYGON(-26.167918065075458 28.10680389404297,
- 26.187020810321858 28.091354370117188, -26.199805575765794 28.125,-26.181937320958628 28.150405883789062, -26.160676690299308 28.13220977783203, -26.167918065075458 28.10680389404297)' ) ,
GEOMFromText( "POINT(CONCAT_WS(' ',latitude,longitude))" ) )
FROM user_location
But the problem is it shows a record with lat: -26.136230, long: 28.338850 as well which is way off the polygon's boundaries. Can anyone please guide?
I'm not sure if you want to calculate the distance to the nearest corner of the polygon, the boundary of the polygon, or some notional central point of it. Either way I think the mathematical solution to this is to use Pythagoras' theorem to work out the proximity of points.
If you have lat1,lng1 and lat2,lng2 expressed in metres I believe that the distance between them is:
SQRT(POW(ABS(lat1 - lat2),2) + POW(ABS(lng1 - lng2),2))
Using an algorithm similar to this you need to decide whether you want to compare your known lat/lng to a single central point of the polygon or to the points of its corners (three times the work!).
MySQL does have a geospatial extension which could be worth looking at. Unfortunately I don't have experience of it.
Okay, did this - and it works - might help someone:
SELECT user_id,latitude,longitude,
Contains(
PolyFromText( 'POLYGON((-26.167918065075458 28.10680389404297, -26.187020810321858 28.091354370117188, -26.199805575765794 28.125,-26.181937320958628 28.150405883789062, -26.160676690299308 28.13220977783203, -26.167918065075458 28.10680389404297))' ),
PointFromText(concat("POINT(",latitude," ",longitude,")"))
) as contains
FROM user_location
=====
Although I agree with the experts' view that PostGIS could be a better option.
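For anyone who wants to double-check the database result in application code, here is a minimal ray-casting sketch in Python. It is a swapped-in technique, not what MySQL does internally; the function name is hypothetical. The ring is the polygon from the query above, with latitude as x and longitude as y (the same order as in the WKT), and it treats the coordinates as flat, which is reasonable for a polygon this small.

```python
def point_in_polygon(x, y, ring):
    """Ray casting: True if (x, y) lies inside the closed ring of (x, y) vertices."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Does the edge (j -> i) straddle the horizontal line through y?
        if (yi > y) != (yj > y):
            # x coordinate where the edge crosses that line.
            x_cross = (xj - xi) * (y - yi) / (yj - yi) + xi
            if x < x_cross:
                inside = not inside
        j = i
    return inside


# Polygon from the query above: (latitude, longitude) pairs, first vertex repeated.
RING = [
    (-26.167918065075458, 28.10680389404297),
    (-26.187020810321858, 28.091354370117188),
    (-26.199805575765794, 28.125),
    (-26.181937320958628, 28.150405883789062),
    (-26.160676690299308, 28.13220977783203),
    (-26.167918065075458, 28.10680389404297),
]

# The record from the question that should not match.
far_away = point_in_polygon(-26.136230, 28.338850, RING)
# A point chosen near the middle of the polygon.
central = point_in_polygon(-26.18, 28.12, RING)
```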
Tech used: MySQL 5.1 and PHP 5.3
I am just designing a new database for a site I am writing. I am looking at the best way of now storing Lat and Lng values.
In the past I have been using DECIMAL and using a PHP/MySQL select in the form:
SQRT(POW(69.1 * (fld_lat - ( $lat )), 2) + POW(69.1 * (($lon) - fld_lon) * COS(fld_lat / 57.3 ), 2 )) AS distance
to find nearest matching places.
Starting to read up more on new technologies I am wondering if I should use Spatial Extensions. http://dev.mysql.com/doc/refman/5.1/en/geometry-property-functions.html
Information is quite thin on the ground though and had a question on how to store the data. Instead of using DECIMAL, would I now use POINT as a Datatype?
Also, once stored as a POINT is it easy just to get the Lat Lng values from it in case I want to plot it on a map or should I additionally store the lat lngs as DECIMALS again as well?
I know I should probably use PostGIS, as most posts on here say; I just don't want to learn a new DB though!
Follow up
I have been playing with the new POINT type. I have been able to add Lat Lng values using the following:
INSERT INTO spatialTable (placeName, geoPoint) VALUES( "London School of Economics", GeomFromText( 'POINT(51.514 -0.1167)' ));
I can then get the Lat and Lng values back from the Db using:
SELECT X(geoPoint), Y(geoPoint) FROM spatialTable;
This all looks good, however the calculation for distance is the bit I need to solve. Apparently MySQL has a place-holder for a distance function but won't be released for a while. In a few posts I have found I need to do something like the below, however I think my code is slightly wrong:
SELECT
placeName,
ROUND(GLength(
LineStringFromWKB(
LineString(
geoPoint,
GeomFromText('POINT(52.5177, -0.0968)')
)
)
))
AS distance
FROM spatialTable
ORDER BY distance ASC;
In this example geoPoint is a POINT entered into the DB using the INSERT above.
GeomFromText('POINT(52.5177, -0.0968)') is a Lat Lng value I want to calculate a distance from.
More Follow-up
Rather stupidly I had just put in the ROUND part of the SQL without really thinking. Taking this out gives me:
SELECT
placeName,
(GLength(
LineStringFromWKB(
LineString(
geoPoint,
GeomFromText('POINT(51.5177 -0.0968)')
)
)
))
AS distance
FROM spatialTable
ORDER BY distance ASC
Which seems to give me the correct distances I need.
I suppose the only thing currently that needs answering is any thoughts on whether I am just making life difficult for myself by using Spatial now or future-proofing myself...
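One thing worth checking before relying on those numbers: as far as I know, GLength on a line between two lat/lng points measures in raw coordinate units (degrees), not kilometres. The Python sketch below is only an illustration with hypothetical names and made-up test points; it shows that ordering by degrees and ordering by an approximate ground distance can disagree at London's latitude.

```python
import math

KM_PER_DEGREE = 111.195  # approximate km per degree of latitude on a sphere


def degree_length(lat1, lon1, lat2, lon2):
    """Planar length in degrees, i.e. what a flat line-length function sees."""
    return math.hypot(lat2 - lat1, lon2 - lon1)


def approx_km(lat1, lon1, lat2, lon2):
    """Approximate ground distance in km, scaling longitude by cos(latitude)."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    return KM_PER_DEGREE * math.hypot(lat2 - lat1, (lon2 - lon1) * math.cos(mean_lat))


origin = (51.5177, -0.0968)
north = (51.6177, -0.0968)  # 0.10 degrees to the north
east = (51.5177, 0.0332)    # 0.13 degrees to the east

deg_north = degree_length(*origin, *north)
deg_east = degree_length(*origin, *east)
km_north = approx_km(*origin, *north)
km_east = approx_km(*origin, *east)
```

In degrees the northern point looks closer, while on the ground the eastern point is closer (roughly 9 km against roughly 11 km), so an ORDER BY on the degree length can rank places wrongly.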
I think you should always use the highest level abstraction easily available. If your data is geospatial, then use geospatial objects.
But be careful. MySQL is the worst geospatial database there is. It's OK for points, but all its polygon functions are completely broken: they reduce the polygon to its bounding rectangle and then compute the answer on that.
The worst example that hit me is that if you have a polygon representing Japan and you ask what places are in Japan, Vladivostok gets into the list!
Oracle and PostGIS don't have this problem. I expect MSSQL doesn't and any Java database using JTS as its engine doesn't. Geospatial Good. MySQL Geospatial Bad.
Just read in "How do you use MySQL spatial queries to find all records in X radius?" that it's fixed in 5.6.1.
Hoorah!
MySQL GIS YAGNI:
If you have no experience with GIS, learning spatial extensions is practically like learning a new database, plus a little math and a lot of acronyms: maps, projections, SRIDs, formats... Do you have to learn all that to calculate distances between points given a certain lat/long? Probably not. Will you be integrating third-party GIS data, or working with anything more complex than points? What coordinate system will you be using?
Going back to YAGNI: do things as simply as possible; in this case, implement your code in PHP or with simple SQL. Once you hit a barrier and decide you need spatial, read up on GIS systems, coordinate systems, projections, and conventions.
By then, you will probably want PostGIS.
It's a good thing, because then you get to use spatial indexes on your queries. Limit to a bounding box, for example, to limit how many rows to compare against.
If you can afford to place some extra code in your backend, use Geohash.
It encodes a coordinate into a string in a way that prefixes denote a broader area. The longer your string is, the more precision you have.
And it has bindings for many languages.
http://en.wikipedia.org/wiki/Geohash
https://www.elastic.co/guide/en/elasticsearch/guide/current/geohashes.html
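To show what the prefix property looks like in practice, here is a small Python sketch of the standard Geohash encoding (bits alternate between longitude and latitude, five bits per base-32 character). The function name is my own; in production you would probably use one of the existing bindings instead.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=8):
    """Encode a lat/lon pair in degrees as a Geohash string of `precision` chars."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # the first bit always refines the longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # five bits make one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


short_hash = geohash_encode(57.64911, 10.40744, precision=3)
long_hash = geohash_encode(57.64911, 10.40744, precision=8)
```

Because a shorter hash is always a prefix of the longer one, a query such as `WHERE geohash LIKE 'u4p%'` on an indexed string column acts as a coarse area filter. Points that sit close together but on opposite sides of a cell border can have different prefixes, so neighbouring cells usually need to be checked as well.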
I have a list of zip codes in a MySQL database together with their latitude and longitude data (column names: ZipCode, Lat, Lon).
Now I have to make search requests (searching by zip code) to extract information from a website. When I make such a search request, the results include all information within a radius of 50km of the zip code.
I don't want to make an unnecessarily high number of search requests, so I would like to minimize the number of zip codes. So I'm looking for a way to filter the zip codes so that I keep only those where the distance between them is >50km.
Unfortunately I have no idea how to do it.
Can someone help me to solve this?
You may be interested in checking out the following presentation:
Geo/Spatial Search with MySQL by Alexander Rubin
The author describes how you can use the Haversine Formula in MySQL to limit your searches to a defined range. He also describes how to avoid a full table scan for such queries, using traditional indexes on the latitude and longitude columns.
You can use Google's web service APIs: the Geocoding API lets you get a zip code from a lat/long (reverse geocoding), and the Distance Matrix API gives you distances between two locations. From this you should be able to get the distance between each of your zip codes and put them into a table; then you can do searches on just these.
Well, I see no other way than to iterate over all rows on each request and filter them by calculating the distance between the selected zip code and all the others, based on Lat & Lon.
I am using something similar...
http://webarto.com/googlemaps
http://webarto.com/izrada-web-stranica/belgrade
PHP function for distance between two LL...
function distance($lat1, $lon1, $lat2, $lon2){
$theta = $lon1 - $lon2;
$dist = sin(deg2rad($lat1)) * sin(deg2rad($lat2)) + cos(deg2rad($lat1)) * cos(deg2rad($lat2)) * cos(deg2rad($theta));
$dist = acos($dist);
$dist = rad2deg($dist);
$miles = $dist * 60 * 1.1515;
return round($miles * 1.609344,3);
}
I calculate it this way...
$sql = mysql_query("SELECT * FROM geoip WHERE city = '$city'");
while($row = mysql_fetch_array($sql)){
$ll = explode(",",$row["ll"]);
$x = distance(44.5428009033,18.6693992615,$ll[0],$ll[1]);
$road = intval($x+($x/3));
echo "Distance between ".$row["city"]." and Tuzla is ".$x." kilometers of airline, that's about ".$road." kilometers of road way.";
}
Daniel's link deals with selecting all the zip codes within 50km of a given latitude/longitude. Once you can do that, you can build a filtered list of zipcodes like this...
Select a zip code at random and add it to the filtered list
Delete all zip codes which lie within 50km of the selected zip code
Select a new zip code at random from the remaining zip codes, repeat until no more are left.
You know that you're only picking zip codes that are >50km from the ones already picked, and you know that once the original table is empty it must be because all zip codes lie within 50km of at least one of your selected zip codes.
That doesn't guarantee the smallest possible list of zip codes, and the size of the result will depend on the random choices. However, I think that this simple algorithm is likely to be "good enough", and that saving a few searches wouldn't justify the extra effort involved in finding a truly optimal solution.
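The pick-and-delete steps above can be sketched in Python as follows. This is only an illustration: the function names, the optional seed parameter, and the sample rows (labelled A to E rather than real zip codes) are made up, and the distance uses the Haversine formula with a mean Earth radius of 6371 km.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def filter_zipcodes(zipcodes, min_km=50.0, seed=None):
    """Greedy filter: keep rows that are pairwise more than min_km apart.

    zipcodes is a list of (zip_code, lat, lon) tuples.
    """
    remaining = list(zipcodes)
    # Shuffling once and popping is equivalent to a random pick each round.
    random.Random(seed).shuffle(remaining)
    selected = []
    while remaining:
        chosen = remaining.pop()
        selected.append(chosen)
        # Drop everything already covered by the chosen zip code.
        remaining = [
            z for z in remaining
            if haversine_km(chosen[1], chosen[2], z[1], z[2]) > min_km
        ]
    return selected


# Illustrative rows: two pairs of nearby points and one isolated point.
SAMPLE = [
    ("A", 52.532, 13.385),
    ("B", 52.400, 13.060),
    ("C", 53.551, 9.993),
    ("D", 53.550, 9.940),
    ("E", 48.137, 11.575),
]

kept = filter_zipcodes(SAMPLE, min_km=50.0, seed=1)
```

Each round compares the chosen row against all remaining rows, so the worst case is quadratic; with a large table it is probably worth combining this with a bounding-box query so that only nearby rows are compared.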
The problem has been discussed previously here on SO with various solutions
I had a similar problem and I used this solution to find the answer. Not sure if you are using java or some other language but the logic can be used in any programming language
Geo Location API and finding user within a radius
I am reverse engineering a transportation visualization app. I need to find out the latitude for the origin of their data feed. Specifically what XY 0,0 is. The only formulas I have found calculate distance between two points, or location of a bearing/distance.
They use the XY to display a map in a very legacy application. The XY is in FEET.
I have these coordinates:
47.70446615506108, -122.34469839507263: x=1268314, y=260622
47.774182540800616,-122.3412994737105: x=1269649, y=286031
47.60024792289405, -122.32767331735774: x=1271767, y=222532
47.57012494413499, -122.29129609983679: x=1280532, y=211374
I need to find out what the latitude and longitude of x=0, y=0 is and what the formula would be to find this out.
They have two data feeds, one is more current than the other. The feed with the most current data does NOT include latitude, longitude, but only XY. I am trying to extrapolate based on their less current, yet more informative (includes lat, lon) data feed what 0,0 is so I can simply convert their (more current) data feed's XY coordinates to latitude and longitude.
If you look at the first 2 lines of data, and subtract the latitude
47.7044 - 47.7741 = -0.06972 degrees
There are 60 nautical miles per degree of latitude, and 6076 feet per nautical mile.
-.06972 * 60 * 6076 ≈ -25,417 ft
Subtracting the two 'Y' values:
260622 - 286031 = -25,409 ft
So indeed that seems to prove the X and Y values are in feet.
If you take any of the Y values, and convert back to degrees, for example
260622 ft / ( 6076 ft/nm ) / ( 60 nm/degree ) = .71
286031 ft / 6076 / 60 = .78
So subtracting those values from the latitudes of (47.70 and 47.77) gives you very close to exactly 47 degrees, which should be your y=0 point.
For longitude, a degree is 60 nautical miles at the equator and 0 miles at the poles. So the number of miles per degree has to be multiplied by the cosine of the latitude, so approx cos(47 degrees), or .68. So instead of 6076 ft per minute of arc, it's about 4145 ft per minute of longitude.
So for the X values,
1268314 ft / ( 4145 ft/nm ) / ( 60 nm/degree ) = 5.10 degrees
1269649 ft / 4145 / 60 = 5.10 degrees
These X numbers increase as the longitude increases (becomes less negative), so x=0 lies to the west of these points and you should subtract about 5.1 degrees, which means the X base point is about
-122.34 - 5.10 = -127.44, i.e. roughly 127.4 West longitude for your x=0 point.
That is out over the Pacific, west of the Washington coast, which suggests the grid uses an offset (false) origin so that all X values stay positive.
So given X=1280532, Y=211374
Lat = 47 + ( 211374 / 6076 / 60 ) = 47.58
Lon = -127.44 + ( 1280532 / 4145 / 60 ) = -122.29
Which is roughly equivalent to the given data 47.57 and -122.29
The variance may be due to different projections - the X,Y system may be a "flattened" projection as opposed to lat/long which apply to a spherical projection? So to be accurate you may yet need more advanced math or that open source library :)
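As a quick way to try this arithmetic against the sample data, here is a Python sketch of the same flat approximation. It fits the offsets to the first sample point from the question and checks the result against the fourth. The 6076 ft per minute of arc and the 47 degree reference latitude come from the reasoning above; the function names are hypothetical, and this is no substitute for a proper projection library.

```python
import math

FEET_PER_ARC_MINUTE = 6076.0  # feet per nautical mile (one minute of latitude)
REF_LAT_DEG = 47.0            # latitude used to scale the east-west feet


def feet_per_degree():
    """Return (north_south, east_west) feet per degree at the reference latitude."""
    north_south = FEET_PER_ARC_MINUTE * 60
    east_west = north_south * math.cos(math.radians(REF_LAT_DEG))
    return north_south, east_west


def estimate_origin(lat, lon, x_ft, y_ft):
    """Estimate the lat/lon of x=0, y=0 from one known point."""
    north_south, east_west = feet_per_degree()
    # X and Y grow to the east and north, so the origin lies south-west.
    return lat - y_ft / north_south, lon - x_ft / east_west


def xy_to_latlon(x_ft, y_ft, origin):
    """Convert feed coordinates in feet to approximate lat/lon."""
    north_south, east_west = feet_per_degree()
    lat0, lon0 = origin
    return lat0 + y_ft / north_south, lon0 + x_ft / east_west


# First sample point from the question.
origin = estimate_origin(47.70446615506108, -122.34469839507263, 1268314, 260622)
# Fourth sample point, used as a check.
lat, lon = xy_to_latlon(1280532, 211374, origin)
```

With these inputs the converted fourth point lands within about a hundredth of a degree of the published latitude and longitude, which may be good enough for plotting but not for surveying-grade work.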
This question may also be helpful, it contains code for calculating great circle distances:
Calculate distance between two latitude-longitude points? (Haversine formula)
There are many different coordinate systems. You first need to find out what the coordinate systems are for both the lat/lons (e.g. WGS84) and the x/ys (probably some sort of projected system).
Once you have that information there are several tools you can use to do conversions and manipulations. One example (of a free open source coding library) is proj4.
Ask them what coordinate system they're using! (or if you got the dataset from some database, look at the metadata for the dataset and it should tell you. Otherwise I'd be skeptical of its value)
Most likely this is one of the state plane coordinate systems. They're for localized areas of the earth (kind of like UTM), and are frequently used for surveying.
You can use CORPSCON (or other GIS programs; ExpertGPS will do this if you have the GIS Option Pack but it's not free. I forget whether GPSBabel does conversion) to convert between lat/long and any of the state plane coordinate systems. You'll also need to know which datum the coordinates are in. WGS84 and NAD83 are very close but NAD27 is different.
You've got good advice on coordinate systems already, so I'll just chime in with the library I've used with great success in the past.
Geotrans is approved for use by the US Department of Defence, so you can be sure that it is well tested. You can grab it from here:
http://earth-info.nga.mil/GandG/geotrans/index.html
That might not be the right link as that page talks about the application, not the library. I expect the library is in the Developers package. Licensing terms were very liberal from memory, but make sure you review the terms before using it commercially.
Edit:
An interesting discussion on Geotrans licensing can be found here:
http://www.mail-archive.com/debian-legal@lists.debian.org/msg39263.html
Over here, I said this:
In Java, I would use the OpenMap converter from a point's expression in UTM to one using Latitude and Longitude (assuming a WGS-84 ellipsoid which is most commonly used in GPS).
OpenMap is open source and I would post a link to their download page but they have a short license script in the way. So, to avoid being rude, I won't deep link. Instead, head to their homepage and click Downloads.
That should either solve your problem directly or at least point you towards a useful algorithm.
I've used Brenor Brophy's gPoint PHP class to do this on a couple of occasions. Solid results, GPL code, and easily deployed. Recommended.