Sort by distance, how do you think I should do this? - mysql

My company currently runs a listing service for family activities. In our CMS we have two types of entities: Branches (the shops we list) and Events (special offers, occasions, etc.).
Typically, when listing an event, we say which Branches it is for and create a relationship. To find events, we search the nearby shops, grab their events, and sort them by distance.
Now our clients want to be able to list a one-off event that has no branch associated with it (for example, a festival hosted at a nearby garden center rather than at one of their shops). I can easily make these sortable by distance as well.
But what I was wondering is how I could combine the two, so one of our apps could go to our API and ask, "Dude, where are 10 events near to where I am right now?" and the API would pull up a list of the 10 closest events.
It should be able to handle Events that use the location of their Branches as well as Events that have their own unique location.
Or do you think I should just store location as its own entity, or have hidden branches: places we can set up as being where the event is happening but that don't actually show up as a branch in the app? :)

If you have lat/long positions for your events and your branches, you can apply the Haversine formula to compute approximate distances, then order by ascending distance.
MySQL can do this if you're willing to use a hairy query. This note from the Google Maps team gives the query. You don't have to use Google Maps to do this; you just need lat/long information for each place involved.
https://developers.google.com/maps/articles/phpsqlsearch_v3
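To make the idea concrete outside of SQL, here is a minimal Python sketch of the Haversine calculation, applied to a mixed list of events. The data layout is an assumption made for illustration: each event either carries its own `location` or a `branch_id` pointing at a single branch, and all names and coordinates are hypothetical. An event linked to several branches would need one entry per branch.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; the result is an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_events(user_lat, user_lon, events, branches, limit=10):
    """Return up to `limit` (distance_km, event_name) pairs, closest first.

    An event uses its own location when it has one; otherwise it falls
    back to the location of the branch it is linked to.
    """
    ranked = []
    for event in events:
        location = event.get("location") or branches[event["branch_id"]]
        distance = haversine_km(user_lat, user_lon, *location)
        ranked.append((distance, event["name"]))
    ranked.sort()
    return ranked[:limit]


# Hypothetical sample data: branch id -> (lat, long).
branches = {1: (51.5074, -0.1278), 2: (53.4808, -2.2426)}
events = [
    {"name": "Shop sale", "branch_id": 2},
    {"name": "Garden festival", "location": (51.7520, -1.2577)},
    {"name": "Open day", "branch_id": 1},
]
result = nearest_events(51.4545, -0.9781, events, branches)
```

The fallback in `nearest_events` is the whole trick: once every event resolves to one lat/long pair, branch-based and standalone events can be ranked in a single list.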
Edit: It's true that this is very slow if you compute the distance between many pairs of places. The trick to making this kind of operation fast is to use a bounding-box ("spherectangular") distance limit and to put indexes on your latitude and longitude columns.
Look at this: Geolocation distance SQL from a cities table
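As a rough illustration of the bounding-box idea, the Python sketch below converts a search radius into latitude and longitude limits. The constant of about 111.045 km per degree of latitude is an approximation, the function names are hypothetical, and the sketch ignores the poles and the 180° meridian.

```python
import math

KM_PER_DEGREE_LAT = 111.045  # approximate length of one degree of latitude


def bounding_box(lat, lon, radius_km):
    """Return (min_lat, max_lat, min_lon, max_lon) enclosing the search circle."""
    dlat = radius_km / KM_PER_DEGREE_LAT
    # A degree of longitude shrinks with cos(latitude).
    dlon = radius_km / (KM_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


def in_box(point, box):
    """True if the (lat, lon) point lies inside the box."""
    lat, lon = point
    min_lat, max_lat, min_lon, max_lon = box
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


box = bounding_box(51.5, -0.12, 50)  # 50 km around a point in London
```

In a database the same four limits would go into two range conditions on the latitude and longitude columns, which is what lets the indexes discard most rows before any trigonometry runs. The exact distance then only needs to be computed for the rows inside the box.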

MySQL has support for spatial databases through its spatial extensions. This allows you to use spatial data types in your columns, build indexes on them, and perform various kinds of spatial analysis, such as polygon intersection.
I'm not sure this is what you need, but it may be worth investigating.

Related

Given a user's lat lng, how to find the nearest lat lng from a database of thousands of lat lng?

I have data of locations of thousands of sensors in MySQL. I want to identify the sensor closest to the user's location and show that specific sensor's data. All the location data is available as lat lng.
I understand that one approach can be to find displacements between the origin and all the sensors using Haversine formula and select the one with the shortest distance. The problem here is that there are tens of thousands of sensors.
Any suggestions/leads?
A spatial index allows efficient queries for points within a specific distance. The problem, of course, is that one might not know the search radius needed in a specific case. Unfortunately, a large radius causes inefficient queries, and a small radius might result in no match at all.
A possible solution is to search with an increasing radius until the search returns some results, and then find the closest result among those.
This article describes this solution for BigQuery; it would require some adaptation for MySQL's scripting dialect:
https://mentin.medium.com/nearest-neighbor-using-bq-scripting-373241f5b2f5
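The increasing-radius loop can be sketched in plain Python as follows. This is not the article's BigQuery script: the in-memory radius filter is only a stand-in for what would be a spatial-index query in a database, and the function names, starting radius, and doubling step are assumptions.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres (mean Earth radius 6371 km)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def nearest_sensor(user, sensors, start_km=1.0, max_km=20040.0):
    """Return (distance_km, sensor_id) of the closest sensor, or None.

    `sensors` maps sensor id -> (lat, lon). The radius doubles until at
    least one sensor falls inside it; max_km is a little more than half
    the Earth's circumference, so the last pass covers everything.
    """
    radius = start_km
    while True:
        hits = []
        for sensor_id, position in sensors.items():
            distance = haversine_km(*user, *position)
            if distance <= radius:  # stand-in for the indexed radius query
                hits.append((distance, sensor_id))
        if hits:
            return min(hits)
        if radius >= max_km:
            return None
        radius = min(radius * 2, max_km)


# Hypothetical sensor positions.
sensors = {"a": (51.5074, -0.1278), "b": (53.4808, -2.2426), "c": (48.8566, 2.3522)}
closest = nearest_sensor((51.4545, -0.9781), sensors)
```

Because each pass returns every sensor inside the circle, the minimum of the first non-empty pass is the true nearest neighbour; anything outside the circle is farther away by construction.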
Not the MySQL answer you are looking for, but PostgreSQL's popular PostGIS extension has a built-in K-nearest-neighbor operator class. Also, see its documentation. It works great!
Also, I am aware of this Go library that allows you to do KNN in memory after building a quadtree from your sensor locations.
For only thousands, a simple bounding box with two 2-column indexes may be fast enough.
For better speed, see SPATIAL indexing.
For details on those two solutions, plus two faster ones, see Find Nearest

How do I make custom location queries using Google Maps APIs?

I have a list of PlaceIds from Google Maps that I got from querying an external list of location names.
I now have a "search" feature where you enter your zip code, and it finds the closest 10 locations and shows them on a map. As far as I can tell, Google only lets you search by a search term.
How do I limit my search results only to the list of PlaceIds that I have present, and order it by distance?
NOTE: I don't have a "keyword" to search for that will include all my PlaceIds. These were retrieved from an external source, and there is no "common search keyword" that will return a super-set of my PlaceIds.
Restricting the results of a PlacesSearch in that way is currently not possible.
Possible approach:
You'll also need a database of ZIP codes (including their locations).
If your DB supports spatial queries, you can use ST_DISTANCE() to order the results and take the 10 nearest.
If your DB doesn't support spatial queries, you can use the Haversine formula to calculate the distance.
Of course, the result may only be approximate, because a ZIP code usually relates to an area, not to a single location.

Handling multiple geofence in google map

I have around 100 geofences (polygons) defined and stored in a DB. My tracking devices update their location once a minute. What would be the best way to check whether a given LatLng is in any of these geofences? I want to trigger an alert when a device is in any of these geofences.
What I could think of is that, each minute after receiving the location from a tracking device, I query the geofence information from the DB or an array and compare against the fences one at a time. But this seems computationally expensive.
Any ideas or help, please?
Assuming that the stored geo-fences are relatively static (i.e. not modified/added/deleted frequently) you could trade storage space for point look-up time by choosing to represent your geo-fences with a suitable spatial data structure.
R-Trees (https://en.wikipedia.org/wiki/R-tree) for example could be used to store which geo-fences might be applicable to a given point location so that only a subset of those fences need to be checked to determine if the point lies within them.
Pragmatically, you are likely best off using an existing spatially enabled database like PostgreSQL+PostGIS (http://postgis.net/), which allows you to efficiently pose queries based on spatial relations (in your application, likely ST_Within or ST_Contains).
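For a sense of what the check involves without a spatial database, here is a small Python sketch. It is not an R-tree: it precomputes one bounding box per fence and scans them linearly, which is a simpler stand-in that should be adequate for around 100 fences. It also treats lat/long as flat coordinates, which is only reasonable for small fences away from the poles and the 180° meridian. All names and sample fences are hypothetical.

```python
def point_in_polygon(lat, lon, polygon):
    """Ray-casting test; `polygon` is a list of (lat, lon) vertices."""
    inside = False
    count = len(polygon)
    for i in range(count):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[(i + 1) % count]
        # Only edges that straddle the point's latitude can be crossed.
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which this edge crosses that latitude.
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross_lon:
                inside = not inside
    return inside


def build_index(fences):
    """Precompute each fence's bounding box once, since fences rarely change."""
    index = []
    for name, polygon in fences.items():
        lats = [vertex[0] for vertex in polygon]
        lons = [vertex[1] for vertex in polygon]
        index.append((name, (min(lats), max(lats), min(lons), max(lons)), polygon))
    return index


def fences_containing(lat, lon, index):
    """Names of all fences that contain the point."""
    hits = []
    for name, (min_lat, max_lat, min_lon, max_lon), polygon in index:
        # Cheap rejection first; the full test runs only for candidates.
        if min_lat <= lat <= max_lat and min_lon <= lon <= max_lon:
            if point_in_polygon(lat, lon, polygon):
                hits.append(name)
    return hits


# Hypothetical fences in flat coordinates.
fences = {
    "square": [(0, 0), (0, 10), (10, 10), (10, 0)],
    "triangle": [(20, 20), (20, 30), (30, 20)],
}
index = build_index(fences)
```

The index would be built once at start-up and reused for every location update, so each one-minute ping costs a handful of comparisons plus a full polygon test only for the few fences whose boxes contain the point.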

How good is the geography datatype in sql server 2008?

I have a large database full of customers, implemented in sql server 2005. Customers each have a latitude and longitude, represented as Decimal(18,15). The most important search query in the database tries to find all customers close to a certain location like this:
( (Addresses.Latitude - #SearchInLat) BETWEEN -1 * #LatitudeBound AND #LatitudeBound)
AND ( (Addresses.Longitude - #SearchInLng) BETWEEN -1 * #LongitudeBound AND #LongitudeBound)
So, this is a very simple method. #LatitudeBound and #LongitudeBound are just numbers, used to pull back all the customers within a rough bounding rectangle of the point #SearchInLat, #SearchInLng. Once the results get to a client PC, some results are filtered out so that there is a bounding circle rather than a rectangle. (This is done on the client PC to avoid calculating square roots on the server.)
This method has worked well enough in the past. However, we now want to make the search do more interesting things: for instance, making the number of results pulled back more predictable, or letting the user dynamically increase the size of the search radius. To do this, I have been looking at the possibility of upgrading to SQL Server 2008, with its Geography datatype, spatial indexes, and distance functions. My question is this: how fast are these?
The advantage of the simple query we have at the moment is that it is very fast and not performance intensive, which is important as it is called very often. How fast would a query based around something like this:
SearchInPoint.STDistance(Addresses.GeographicPoint) < #DistanceBound
be by comparison? Do the spatial indexes work well, and is STDistance fast?
If you're handling just a standard lat/lng pair as you describe, and all you're doing is a simple lookup, then arguably you're not going to gain much of a speed increase by using the geometry type.
However, if you do want to get more adventurous, as you state, then swapping to the geometry types will open up a whole world of new possibilities for you, and not just for searches.
For example (based on a project I'm working on), you could (if it's UK data) download the polygon definitions for all the towns, villages, and cities in a given area, then cross-reference them to search within a particular town. Or, if you had a road map, you could find which customers live next to major delivery routes, motorways, primary roads, all sorts of things.
You could also do some very fancy reporting. Imagine a map of towns where each outline is plotted and then shaded with a colour to show the density of customers in that area; some simple geometry SQL will easily return a count straight from the database to graph this kind of information.
Then there's tracking. I don't know what data you handle or why you have customers, but if you're delivering anything, feeding in the coordinates of a delivery van tells you how close it is to a given customer.
As for the question "Is STDistance fast?", well, that's difficult to say, really. I think a better question is "Is it fast in comparison to...?"; it's difficult to say yes or no unless you have something to compare it to.
Spatial indexes are one of the primary reasons for moving your data to a geographically aware database. They are optimised to produce the best results for a given task but, as with any database, if you create bad indexes then you will get bad performance.
In general you should see some speed increase on spatial queries, because the maths behind the sorting and indexing is aware of the data's purpose, as opposed to being fairly linear in operation like a normal index.
Bear in mind as well that the beefier the SQL Server machine is, the better the results you'll get.
One last point to mention is management of the data. If you're using a GIS-aware database, that opens the avenue to using a GIS package such as ArcMap or MapInfo to manage, correct and visualise your data, meaning corrections are very easy to make by pointing, clicking and dragging.
My advice would be to create a table side by side with your existing one, formatted for spatial operations, then write a few stored procs and do some timing tests to see which comes out best. If you get a significant increase just on the basic operations you're doing, then that's justification alone; if it's about equal, then your decision really hinges on what new functionality you actually want to achieve.

Mysql Database design - Storing single and multi lat longs

I have a MySQL database table that will store locations for buildings, events, and a few other things. All locations are stored in one table and linked to buildings, events, etc. through their own many-to-many tables. That way I can just display dots on a map, and also allow filtering etc.
However, the problem is that some things have a single location (one lat,long), while others, like a track, have a number of lat,long positions, and something like a large stadium might have a polygon over it. Polygons are also stored as a list of lat,longs, with the first and last being the same.
I'm wondering how I should store this in the MySQL DB, though. Originally I just had a column for lat, a column for long, and an id for the lookup table. Should I have ANOTHER lookup table for the coordinates, serialise the data in some way before putting it into the DB, or just store the whole string in one field?
lat1,long1
lat1,long1;lat2,long2;lat1,long1
Any suggestions?
I wouldn't de-normalize the data from the start, by pushing a whole "serialized" polygon into a single field.
Rather, I'd have a Polygons table (with polygon ID and possibly auxiliary per-polygon info, such as whether it's an actual closed polygon or just a polyline -- though that might alternatively be represented in the following table by having the last point equal to the first one for a certain polygon), and a PointsInPolygon table (with coordinates of the point, polygon ID foreign key, vertex number within polygon -- the latter two jointly being unique).
Normalization will make (as usual) your life much simpler for ad-hoc queries (including in this case "polygons near X", point in polygon, etc). Again as usual, you can later add redundant denormalized values if and when you determine that some specific query really needs to get optimized (at some cost to table updates, integrity checks, etc). Geodata are not all that different from other kinds in this regard.
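A minimal sketch of that two-table layout, using Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere. The column names and sample coordinates are assumptions; only the table names and the uniqueness of (polygon ID, vertex number) come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Polygons (
    polygon_id INTEGER PRIMARY KEY,
    is_closed  INTEGER NOT NULL          -- 1 = closed polygon, 0 = polyline
);
CREATE TABLE PointsInPolygon (
    polygon_id INTEGER NOT NULL REFERENCES Polygons(polygon_id),
    vertex_no  INTEGER NOT NULL,         -- position of the vertex in the shape
    lat        REAL NOT NULL,
    lng        REAL NOT NULL,
    PRIMARY KEY (polygon_id, vertex_no)  -- the two are jointly unique
);
""")

# A hypothetical three-point track stored as an open polyline.
conn.execute("INSERT INTO Polygons VALUES (1, 0)")
conn.executemany(
    "INSERT INTO PointsInPolygon VALUES (?, ?, ?, ?)",
    [(1, 0, 51.50, -0.12), (1, 1, 51.51, -0.10), (1, 2, 51.52, -0.09)],
)

# Ad-hoc query: which shapes have at least one vertex inside a lat/long box?
rows = conn.execute(
    """SELECT DISTINCT polygon_id FROM PointsInPolygon
       WHERE lat BETWEEN ? AND ? AND lng BETWEEN ? AND ?""",
    (51.505, 51.515, -0.11, -0.09),
).fetchall()
```

A "shapes near X" query like the last one is a plain range condition here, whereas with a serialized string in one field every row would have to be parsed first.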
Since you're not doing any lookups on the locations, and you're using (I'm assuming) the Google Maps API, the simplest solution would probably be to encode the list of lat/lon values as JSON and store it in a varchar column.
You can then output the JSON straight from the database for your Google Maps API code to use. I would suggest using a simple JSON structure like so: ["point",1.23456,2.34567] or ["line",1.23456,2.34567,3.45678,4.56789] and so on.
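For illustration, here is one way the flat structure could be produced and read back in Python with the standard json module. The helper names are hypothetical, and the layout follows the suggestion above: a type tag followed by alternating lat and long values.

```python
import json


def encode_shape(kind, points):
    """Flatten [(lat, lng), ...] into ["kind", lat1, lng1, lat2, lng2, ...]."""
    flat = [kind]
    for lat, lng in points:
        flat.extend([lat, lng])
    return json.dumps(flat)


def decode_shape(text):
    """Inverse of encode_shape: return (kind, [(lat, lng), ...])."""
    kind, *coords = json.loads(text)
    # Pair up even-indexed latitudes with odd-indexed longitudes.
    points = list(zip(coords[0::2], coords[1::2]))
    return kind, points


stored = encode_shape("line", [(1.23456, 2.34567), (3.45678, 4.56789)])
kind, points = decode_shape(stored)
```

The encoded string is what would go into the varchar column, so the column length needs to be generous enough for the longest track or polygon you expect to store.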