So I have thousands of users with latitude and longitude. They check in with new coordinates every 30 seconds.
When they check in, I need to send them the 100 people closest to them, no matter how far away they are. In a crowded city this may be a radius of half a mile. In the country it could take a radius of 100 miles to get 100 people.
It's easy enough to calculate the distance of each user from the user checking in and then do LIMIT 100. But that is essentially a table scan: it calculates the distance between the checking-in user and every other user in the table, sorts them by distance, and then takes 100.
That won't be efficient at scale.
So what strategy can I use to scope the query to a subset of users and still get 100 results?
I don't think MySQL will serve you well in the long run. I'd recommend checking out the SingleStore database for your use case, since it's efficient, scalable, and faster.
For reference, please go through the SingleStore documentation.
Related
I have a MySQL database table with a series of points, stored in a geometry data type (basically, a lat/lon coordinate). I need to get all the points that are close to some coordinates (close meaning less than 1 km away, let's say).
Right now I am getting ALL points from the table and, in PHP, calculating which of them are closer than the desired distance (1 km).
The problem is that there are thousands of points, so the performance is very poor.
Is there a way to get only those close points directly from the database? I don't know which function might help me.
Thank you!
I have data of locations of thousands of sensors in MySQL. I want to identify the sensor closest to the user's location and show that specific sensor's data. All the location data is available as lat lng.
I understand that one approach is to compute the distance between the origin and every sensor using the Haversine formula and select the one with the shortest distance. The problem here is that there are tens of thousands of sensors.
Any suggestions/leads?
A spatial index allows efficient queries for points within a specific distance. The problem, of course, is that you might not know the search radius needed in a specific case. A large radius makes the query inefficient, and a small radius might return no match at all.
A possible solution is to search with an increasing radius until the search returns some results, and then find the closest result among those.
This article describes this solution for BigQuery; it would require some adaptation for the MySQL scripting dialect:
https://mentin.medium.com/nearest-neighbor-using-bq-scripting-373241f5b2f5
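To make the expanding-radius idea concrete, here is a minimal Python sketch. It is not the code from the article: the names `nearest_k` and `within_radius`, the starting radius of 1 km, and the doubling step are all my own assumptions. `within_radius` is just a plain scan standing in for whatever indexed radius query your database offers.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(points, lat, lon, radius_km):
    """Hypothetical stand-in for an indexed radius query; here it is a plain scan."""
    hits = []
    for p in points:
        d = haversine_km(lat, lon, p[0], p[1])
        if d <= radius_km:
            hits.append((d, p))
    return hits


def nearest_k(points, lat, lon, k, start_km=1.0):
    """Double the search radius until at least k points are found (or the
    radius covers the whole globe), then return the k closest of them."""
    max_km = math.pi * EARTH_RADIUS_KM  # half the Earth's circumference
    radius = start_km
    while True:
        hits = within_radius(points, lat, lon, radius)
        if len(hits) >= k or radius >= max_km:
            hits.sort(key=lambda h: h[0])
            return [p for _, p in hits[:k]]
        radius = min(radius * 2, max_km)
```

Once a radius yields at least k hits, the k closest of those hits are also the k closest overall, because every point outside the circle is farther away than every point inside it. Doubling keeps the number of queries logarithmic in the final radius.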
Not the MySQL answer you are looking for, but PostgreSQL's popular PostGIS extension has a built-in K Nearest Neighbor (KNN) operator class. Also, see its documentation. It works great!
Also, I am aware of this Go library that allows you to do KNN in memory after building a Quadtree with your sensor locations.
For only thousands, a simple bounding box with two 2-column indexes may be fast enough.
For better speed, see SPATIAL indexing.
For details on those two solutions, plus two faster ones, see Find Nearest
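As a rough illustration of the bounding-box idea, here is a Python sketch that turns a centre point and a radius into min/max latitude and longitude. The function names and the 111.195 km-per-degree constant (derived from a 6371 km Earth radius) are my assumptions, not something from the linked write-up. In SQL the same four numbers would go into a `WHERE lat BETWEEN ... AND lon BETWEEN ...` clause that the two composite indexes can serve.

```python
import math

KM_PER_DEG_LAT = 111.195  # approx. 2 * pi * 6371 km / 360


def bounding_box(lat, lon, radius_km):
    """Return (min_lat, max_lat, min_lon, max_lon) enclosing a circle of radius_km.

    Assumption: the box does not cross the antimeridian or a pole;
    those cases need extra handling that this sketch leaves out.
    """
    dlat = radius_km / KM_PER_DEG_LAT
    # A degree of longitude shrinks with cos(latitude); clamp to avoid dividing by ~0.
    cos_lat = max(math.cos(math.radians(lat)), 1e-12)
    dlon = min(radius_km / (KM_PER_DEG_LAT * cos_lat), 180.0)
    return (lat - dlat, lat + dlat, lon - dlon, lon + dlon)


def in_box(point, box):
    """Cheap prefilter: True if (lat, lon) lies inside the box."""
    min_lat, max_lat, min_lon, max_lon = box
    return min_lat <= point[0] <= max_lat and min_lon <= point[1] <= max_lon
```

The box is only a prefilter: its corners are farther away than the radius, so the survivors still need an exact distance check before sorting.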
I am working with PHP and use MySQL for the database. I need a way to get the 5 closest coordinates to a given coordinate from the database that is very fast and at least 80-90% accurate. I have researched a lot. I found the Haversine formula, the spherical law of cosines, the bounding-square method (comparing min and max latitude/longitude values with the coordinates in the database), and other methods that use trigonometric functions. But all of these take a long time to return a result on a table with thousands of entries. Does MySQL provide any function to do it fast?
See this similar question on the GIS Stack site. The performance of your ultimate solution will depend on how many targets are in the reference table you are searching and if you can limit the distance you are interested in (such as closest 5 within 30 miles). I don't think you can reliably optimize the process; you need to calculate the distance for all coordinates in your reference table.
I have 451 cities with coordinates. Now I want to calculate the distance between each city and then order some results by that distance. Now I have 2 options:
I can run a loop that would calculate distance for every possible combination of cities and storing them into a table, which would result in roughly 200k rows.
Or, I can leave the cities without pre-calculating and then, when results are displayed (about 30 per page), and calculate the distance for each city separately.
I don't know which would be better for performance, but I would prefer going for option one, in which case I have another concern: is there a way I could store as few rows as possible? Currently, I would count the possibilities as 451^2, but I think I could divide that by 2, since the distance City1-City2 is the same as City2-City1.
Thanks
If your table of cities is more or less static, then you should definitely pre-calculate all distances and store them in a separate table. In this case you will have roughly 451^2/2 rows (exactly 451*450/2 = 101,475 unordered pairs); just make sure that the id of City1 is always lower than the id of City2 (or the other way round, it doesn't really matter).
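A small Python sketch of that pre-calculation, assuming the cities live in a dict of id to coordinates. The names `pairwise_distances` and `lookup` are made up for illustration, and the distance function is passed in, so any formula (Haversine or something cheaper) can be plugged in.

```python
from itertools import combinations


def pairwise_distances(cities, distance):
    """Build {(low_id, high_id): distance} with one entry per unordered pair.

    cities:   dict mapping city id -> coordinates
    distance: callable taking two coordinate values and returning a number
    """
    table = {}
    # combinations() over sorted ids yields each pair once, lower id first.
    for a, b in combinations(sorted(cities), 2):
        table[(a, b)] = distance(cities[a], cities[b])
    return table


def lookup(table, id1, id2):
    """Fetch a stored distance regardless of argument order."""
    if id1 == id2:
        return 0.0
    key = (id1, id2) if id1 < id2 else (id2, id1)
    return table[key]
```

Normalising the key in `lookup` is what lets the table hold only half of the matrix: callers can ask for either order and still hit the single stored row.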
Normally the cost of a single MySQL query is quite high and the cost of mathematical operations really low. Especially if the scale of your map is small and the required precision is low, so you can calculate with a fixed distance between degrees, you will be faster with calculating.
Furthermore you would have a problem if the number of cities rises because of a change in your project and therefore the number of combinations you'd have to store in the DB exceeds the limits.
So you'd probably be better off without pre-calculating.
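For what "calculating with a fixed distance between degrees" could look like, here is a Python sketch of the equirectangular approximation. This is my reading of the idea, not the answerer's code: the 111.195 km-per-degree constant assumes a 6371 km Earth radius, and the longitude term is scaled by the cosine of the mean latitude so the result stays usable away from the equator.

```python
import math

KM_PER_DEG = 111.195  # approx. length of one degree of latitude in km


def approx_distance_km(lat1, lon1, lat2, lon2):
    """Flat-map approximation; reasonable for short distances only."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = (lon2 - lon1) * KM_PER_DEG * math.cos(mean_lat)  # east-west component
    dy = (lat2 - lat1) * KM_PER_DEG                       # north-south component
    return math.hypot(dx, dy)
```

It needs one cosine and one square root per pair, which is cheap enough to run on the 30 or so rows shown per page. The error grows with distance and near the poles, so it fits the "small map, low precision" case described above.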
I have a table with zipcode (int) and Location (point). I'm looking for a MySQL query or function. Here is an example of the data. I'd like to return everything within 100 miles.
37922|POINT(35.85802 -84.11938)
Is there an easy query to achieve this?
Okay, so I have this:
SELECT X(Location), Y(Location) FROM zipcodes;
This gives me my two coordinates, but how do I figure out what's within a given distance of that x/y?
The query to do this is not too hard, but is slow. You would want to use the Haversine formula.
http://en.wikipedia.org/wiki/Haversine_formula
Converting that to SQL should not be too difficult, but calculating the distance for every record in a table gets costly as the data set increases.
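Before converting it to SQL, it may help to see the formula written out. This is a Python sketch, assuming a mean Earth radius of 3958.8 miles. The helper `within_miles` and the dict layout are hypothetical, only there to show the "everything within 100 miles" filter from the question.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_miles(zipcodes, lat, lon, limit_miles):
    """Return the keys of all entries no farther than limit_miles away.

    zipcodes: dict mapping zipcode -> (lat, lon); scans every entry.
    """
    return [z for z, (zlat, zlon) in zipcodes.items()
            if haversine_miles(lat, lon, zlat, zlon) <= limit_miles]
```

Like the SQL version, this touches every row, which is exactly why the cost grows with the table.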
The work can be significantly reduced by using a geohash function to limit the locus of candidate records. If accuracy is important, the Haversine formula can be applied to the records inside a geohash region.
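For readers who have not met geohashes, here is a minimal Python encoder as a sketch. It follows the standard public geohash scheme (interleaved longitude/latitude bits, base-32 alphabet); the function name and the default precision are my own choices.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat, lon, precision=6):
    """Encode a lat/lon pair (degrees) as a geohash string of `precision` chars."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, value = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = (value << 1) | 1
                lon_lo = mid
            else:
                value <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1
                lat_lo = mid
            else:
                value <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[value])
            bits, value = 0, 0
    return "".join(chars)
```

Storing the hash in an indexed column lets a prefix match (`LIKE 'abc%'` in SQL) pull all points in one cell. Two points on opposite sides of a cell border can be close together yet share no prefix, so a real query would also need to check the neighbouring cells before applying Haversine.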
If the GIS and spatial extensions in your version of MySQL are not complete enough for this, consider using Elasticsearch or MongoDB.
There is a pretty complete discussion here:
Formulas to Calculate Geo Proximity