Optimizing nearby locations MySQL query via Spatial Indexes

I am trying to optimize my SQL query to display the nearest locations.
Originally I was using the following query
SELECT name,path, ( 6371 * acos( cos( radians(-36.848461) ) * cos( radians( latitude ) ) * cos( radians( longitude ) - radians(174.763336) ) + sin( radians(-36.848461) ) * sin( radians( latitude ) ) ) ) AS distance FROM cityDB HAVING distance < 200 AND path IS NOT NULL
This query uses the longitude / latitude values to calculate the distance and takes 14.5 seconds to complete (too slow!)
I created a spatial index (point) of the longitude / latitude values in an effort to speed up the query.
SELECT DISTINCT name,
(ST_Length(ST_LineStringFromWKB(
LineString(
pt,
ST_PointFromText('POINT(174.763336 -36.848461)', 4326)))))
AS distance
FROM CityDB
ORDER BY distance ASC LIMIT 99
However, this averages 23 seconds, even longer than the first query, which really surprised me!
Is there something else I should be doing with my query to speed it up?
The only thing that got it below 1 second was adding this (thanks to the post "Improving performance of spatial MySQL query"):
longitude BETWEEN longpoint - (50.0 / (111.045 * COS(RADIANS(latpoint))))
AND longpoint + (50.0 / (111.045 * COS(RADIANS(latpoint))))
However, there are a couple of bugs in the code. If I choose Fiji, it only displays locations with longitude < 180; if I choose Vanuatu, it only returns locations with longitude < -180.
SELECT DISTINCT name, (ST_Length
(ST_LineStringFromWKB(LineString( pt, ST_PointFromText('POINT(-179.276277 -18.378639)', 4326)))))
AS distance FROM CityDB
WHERE longitude BETWEEN -179.276277 - (50.0 / (111.045*COS(RADIANS(-18.378639)))) AND -179.276277 + (50.0 / (111.045 * COS(RADIANS(-18.378639))))
GROUP BY (name) ORDER BY distance ASC LIMIT 99
In addition to this, it will also display locations in Russia which is miles away (I know that we can add a limit to distance but there must be a better way than this)
Any other efficient ways to make such a query?
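The Fiji and Vanuatu behaviour looks like the usual antimeridian problem: a plain BETWEEN on longitude cannot wrap from 180 to -180. A minimal sketch in Python of how the longitude stripe could be split into two ranges when it crosses the date line follows. The function name and the 200 km radius are illustrative assumptions, and 111.045 is the km-per-degree figure from the BETWEEN clause above. Each returned range would become its own `longitude BETWEEN ... AND ...` condition joined with OR, and adding a matching latitude range would plausibly keep far-away rows (such as the ones in Russia) out of the candidate set.

```python
import math

KM_PER_DEGREE = 111.045  # km per degree, as used in the BETWEEN clause above


def longitude_ranges(lat, lng, radius_km):
    """Return one or two (min, max) longitude ranges covering radius_km.

    Hypothetical helper: two ranges are returned when the search stripe
    crosses the antimeridian (longitude +/-180).
    """
    # A degree of longitude shrinks with cos(latitude).
    delta = radius_km / (KM_PER_DEGREE * math.cos(math.radians(lat)))
    if delta >= 180.0:
        # Near the poles the stripe covers every longitude.
        return [(-180.0, 180.0)]
    low, high = lng - delta, lng + delta
    if low < -180.0:
        # Stripe spills past -180: wrap the overflow round to the east side.
        return [(low + 360.0, 180.0), (-180.0, high)]
    if high > 180.0:
        # Stripe spills past +180: wrap the overflow round to the west side.
        return [(low, 180.0), (-180.0, high - 360.0)]
    return [(low, high)]


fiji = longitude_ranges(-18.378639, -179.276277, 200)
auckland = longitude_ranges(-36.848461, 174.763336, 200)
```

For the Fiji point with a 200 km radius this yields two ranges, one ending at 180 and one starting at -180; for the Auckland point it yields a single ordinary range.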

Related

How to optimize complex calculation query execution time?

I have a query like this :
SELECT *, (
6371 * acos (
cos ( radians(33.577718) )
* cos( radians( `Latitude` ) )
* cos( radians( `Longitude` ) - radians(115.846524) )
+ sin ( radians(33.577718) )
* sin( radians( `Latitude` ) )
)
) AS `distance`
FROM `geopc_cn_places_grouped`
WHERE `Latitude`!=33.577718 AND `Longitude`!=115.846524
HAVING `distance` < 200
ORDER BY `distance` ASC
LIMIT 30;
The query execution is always somewhere between 3.5 and 4 seconds.
I have applied a composite index to Latitude and Longitude by running ALTER TABLE geopc_cn_places_grouped ADD INDEX index_Longitude_Latitude(Longitude, Latitude);, but it doesn't reduce the execution time.
I want to know why it's running slow and what possible optimizations can be done.
The slow query log message shows this
and this is the EXPLAIN SELECT query
Table Structure...
and lastly, here is the table index list
Your query as written isn't sargable. That is, it cannot exploit any index. So, each time you run it, you use that big spherical cosine law formula for every row in your table. It's a full table scan. It's likely that most of your slowness comes from the table scan, because modern computers do the math pretty quickly once they have the data in RAM.
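But you're in luck. Your search looks for points within a radius of 200 of your candidate point (kilometers, given the 6371 multiplier; it would be statute miles with 3959). That means you can use a WHERE ... BETWEEN clause to eliminate points that are more than that distance south or north (latitude) of your starting point.
To do this you need to know there are 69.0 statute miles, 60 nautical miles, and 111.045 km in each degree of latitude. For a radius in miles you would search for point ± (200/69) degrees; for kilometers, ± (200/111.045) is enough. The ± (200/69) box used below is wider than a 200 km search strictly needs, but it never excludes a valid row. So try a query like this.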
SELECT *, (
6371 * acos (
cos ( radians(33.577718) )
* cos( radians( `Latitude` ) )
* cos( radians( `Longitude` ) - radians(115.846524) )
+ sin ( radians(33.577718) )
* sin( radians( `Latitude` ) )
)
) AS `distance`
FROM `geopc_cn_places_grouped`
WHERE `Latitude`!=33.577718 AND `Longitude`!=115.846524
AND Latitude BETWEEN 33.577718 - (200/69) AND 33.577718 + (200/69)
HAVING `distance` < 200
ORDER BY `distance` ASC
LIMIT 30;
Then create an index on your Latitude column.
CREATE INDEX latsearch ON geopc_cn_places_grouped(Latitude);
The Latitude BETWEEN clause I suggest will then do an index range scan and so skip many of the rows in your table. That's the classic SQL way of making queries faster.
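To make the size of that latitude stripe concrete, here is a small Python sketch of the arithmetic. The function name is a hypothetical helper; the per-degree constants are the ones quoted above. It computes the BETWEEN bounds for the same starting point using both the miles and the kilometers divisor.

```python
KM_PER_DEGREE_LAT = 111.045    # km per degree of latitude (from the answer)
MILES_PER_DEGREE_LAT = 69.0    # statute miles per degree of latitude


def latitude_bounds(lat, radius, units_per_degree):
    """Return (south, north) latitude bounds for a search radius.

    radius and units_per_degree must use the same unit (km or miles).
    Bounds are clamped to the valid latitude range.
    """
    delta = radius / units_per_degree
    return max(lat - delta, -90.0), min(lat + delta, 90.0)


# Bounds as written in the query: 200 / 69 degrees either side.
south_mi, north_mi = latitude_bounds(33.577718, 200, MILES_PER_DEGREE_LAT)
# Tighter bounds if the 200 is read as kilometers.
south_km, north_km = latitude_bounds(33.577718, 200, KM_PER_DEGREE_LAT)
```

The first stripe is about 2.9 degrees either side of the point and the second about 1.8 degrees, so the kilometers version lets the index range scan discard more rows.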
This is a simplification of the ideal answer to this question. I wrote up this problem here.
Your query must compute the distance for every row. The quick solution is to use a "bounding box". This limits the number of rows to test to a latitude stripe or longitude stripe.
Details (and more advanced speedups): http://mysql.rjweb.org/doc.php/find_nearest_in_mysql

Optimize my nearby locations MYSQL query

I have a query which is causing my site to run very slow. Essentially it identifies the nearest location to a longitude / latitude point from a database of around 2m records.
Currently this query takes 7 seconds to complete.
I have done the following to speed it up (before it was more than 15 seconds!)
Added index keys to name / longitude / latitude / path
stored the path in the database so that it does not need to run
Stored results into another table so we do not have to run the query again.
Considered splitting the database by country, however this will cause a problem if the nearest location is in a neighboring country.
Any other ideas? Is there a way to possibly limit the longitude / latitude in the query eg + or - 2 degrees?
SELECT name,path, ( 6371 * acos( cos( radians(?) ) * cos( radians( latitude ) ) * cos( radians( longitude ) - radians(?) ) + sin( radians(?) ) * sin( radians( latitude ) ) ) ) AS distance FROM ".$GLOBALS['table']." HAVING distance < 200 AND path IS NOT NULL
Do not use latitude and longitude columns, as this way indices are useless since you need to calculate the distance metric for each record every time you query, with no ability to optimise it.
MySQL now supports geospatial data using POINT datatype and CREATE SPATIAL INDEX, which MySQL knows how to optimise.
Something like this; though MySQL 8.0 should be even better.

How to search nearby location with kilometer in MySQL

I have a database table with the structure below. How can I get the list of location_id values within 5 kilometers? The latitude and longitude values are already in the table. Please see my database structure image.
- school_id
- location_id
- school_name
- lat
- lng
Here is the database structure image:
I have already searched from this link
How to find nearest location using latitude and longitude from sql database?
and I don't understand the code.
SELECT id, ( 3959 * acos( cos( radians(37) ) * cos( radians( lat ) ) *
cos( radians( lng ) - radians(-122) ) + sin( radians(37) ) * sin(
radians( lat ) ) ) ) AS distance FROM markers HAVING distance < 25
ORDER BY distance LIMIT 0 , 20;
The constant literal 3959 represents an approximation of the radius of the earth, in miles. That's why the "great circle distance" expression is returning a value in miles.
To get distance in kilometers, just replace 3959 with 6371, an approximation of the earth's radius in km.
Reference: https://en.wikipedia.org/wiki/Great-circle_distance
What the query is doing is calculating a distance (in miles) between two points on the earth, represented by degrees latitude and degrees longitude.
One of the points is represented by literal values in the GCD expression (37.000000,-122.000000). The other point is (lat,lng) (degrees latitude and degrees longitude) from the row in the database.
The query cranks through every row in the table, and evaluates the GCD expression to calculate a distance. (The length of shortest line along the surface of the sphere between the two points.)
The HAVING distance < 25 clause excludes any row where the calculated distance is either greater than or equal to 25 or NULL.
The ORDER BY distance clause returns the rows in sequence by ascending values of distance, the closest points first.
The LIMIT 20 clause restricts the return to the first twenty rows.
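For readers who find the SQL expression hard to follow, the same great-circle calculation can be sketched in Python. The function name and the clamping step are additions for illustration, not part of the original query. The `radius` argument plays the role of the 3959 (miles) or 6371 (km) constant.

```python
import math


def great_circle_distance(lat1, lng1, lat2, lng2, radius=6371.0):
    """Distance between two points given in degrees.

    Uses the same spherical-law-of-cosines expression as the SQL above.
    radius=6371.0 gives kilometers; radius=3959.0 gives miles.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlng = math.radians(lng2 - lng1)
    cos_angle = (math.cos(phi1) * math.cos(phi2) * math.cos(dlng)
                 + math.sin(phi1) * math.sin(phi2))
    # Guard against rounding pushing the value just outside [-1, 1],
    # which would make acos raise an error for identical points.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return radius * math.acos(cos_angle)


# One degree of latitude along a meridian, in km and in miles.
one_degree_km = great_circle_distance(0.0, 0.0, 1.0, 0.0)
one_degree_mi = great_circle_distance(0.0, 0.0, 1.0, 0.0, radius=3959.0)
```

Swapping the radius is the only change needed to switch units, which is exactly what replacing 3959 with 6371 does in the query.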
FOLLOWUP
Within five kilometers of what? The Santa Monica Pier Aquarium?
That's latitude 34.010396, longitude -118.496029.
We can set user-defined variables (to avoid spreading literals in our query text):
SET @lat = 34.010396 ;
SET @lng = -118.496029 ;
Our SQL text includes in the SELECT list the columns we want to return from our table. We'll also include a complicated-looking "Great Circle Distance" expression that returns a distance in kilometers.
Something Like this:
SELECT m.school_id
, m.location_id
, m.school_name
, m.lat
, m.lng
, ( ACOS( COS( RADIANS( @lat ) )
* COS( RADIANS( m.lat ) )
* COS( RADIANS( m.lng ) - RADIANS( @lng ) )
+ SIN( RADIANS( @lat ) )
* SIN( RADIANS( m.lat ) )
)
* 6371
) AS distance_in_km
FROM mytable m
ORDER BY distance_in_km ASC
LIMIT 100
The GCD formula in the expression is calculating a distance between two points.
In this query, one of the points is a constant (@lat,@lng), which we previously set to the coordinates of the Santa Monica Pier Aquarium.
The other point is (m.lat,m.lng), the latitude and longitude from the row in the table.
So in this query, distance_in_km represents the distance between (lat,lng) of the row in the table and the Santa Monica Pier Aquarium.
Because distance_in_km value is not available at the time the rows are accessed, we can't reference that in a WHERE clause.
But we can reference it in a HAVING clause. That's similar to a WHERE in that it filters out rows, but it is quite different because it is evaluated much later in the query execution. It can therefore reference expressions that aren't available when the rows are being accessed, which is when the WHERE clause is evaluated.
We can modify our query to include the HAVING clause. In this case, we're limiting to rows that are within 100 kilometers, and we'll return only the closest 12 rows...
FROM mytable m
HAVING distance_in_km <= 100
ORDER BY distance_in_km ASC
LIMIT 12
If we want to find the distance to some point other than the Santa Monica Pier, we set @lat and @lng for that point, and re-execute the SQL.
SELECT *, ((ACOS(SIN(inputLat * PI() / 180) *
SIN(tableColLat * PI() / 180) + COS(inputLat * PI() / 180) *
COS(tableColLat * PI() / 180) * COS((inputLng - tableColLng) * PI() / 180)) * 180 / PI()) * 60 * 1.1515)
as distance FROM gm_shop HAVING distance <= 5 ORDER BY distance ASC;
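You can change the distance threshold. Note that the factor 60 * 1.1515 converts degrees to statute miles, so the result is in miles; multiply by 1.609344 to get kilometers.
When I solved a similar problem, instead of worrying about the curvature of the earth and angles, I just used the Pythagorean distance. It's a close enough approximation IMHO.
i.e. getting all schools that are within an approximate 5 km Pythagorean distance.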
select sqrt(pow(lat - curlat,2) + pow(lng - curlng,2)) as distance from markers having distance < XXXXX
You'll have to calculate the approximate value for XXXXX. 1 degree is approximately 111 km. If you aren't working at extreme latitudes you shouldn't have to worry about adjusting it. So you could set XXXXX to 0.045 (that is, 5 / 111).
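A short Python sketch shows where the 0.045 comes from and how much the approximation drifts away from the equator. The function name is hypothetical, and the cos(latitude) scaling of the longitude difference is an addition that the query above does not include.

```python
import math

KM_PER_DEGREE = 111.0  # rough km per degree, as quoted in the answer

# Threshold in degrees for a 5 km search radius.
threshold_degrees = 5 / KM_PER_DEGREE


def planar_distance_km(lat1, lng1, lat2, lng2):
    """Pythagorean distance in km, treating a small area as flat.

    Unlike the raw query above, the longitude difference is scaled by
    cos(mean latitude), because degrees of longitude get shorter
    towards the poles.
    """
    mean_lat = math.radians((lat1 + lat2) / 2)
    dlat = lat2 - lat1
    dlng = (lng2 - lng1) * math.cos(mean_lat)
    return KM_PER_DEGREE * math.sqrt(dlat ** 2 + dlng ** 2)


# One degree of longitude at the equator versus at 60 degrees north.
at_equator = planar_distance_km(0.0, 0.0, 0.0, 1.0)
at_60_north = planar_distance_km(60.0, 0.0, 60.0, 1.0)
```

At 60 degrees north a degree of longitude covers only about half the distance it does at the equator, so an unscaled threshold of 0.045 degrees would cover roughly 5 km north-south but only about 2.5 km east-west there.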

How do I calculate distance using a Google Maps Fusion Table?

I am working with Google Map Fusion Tables and recently faced a tough problem while getting required data.
I am using below query:
SELECT geometry, ZIP, latitude, longitude,( 3959 * acos( cos( radians(37) ) * cos( radians( lat ) ) * cos( radians( lng ) - radians(-122) ) + sin( radians(37) ) * sin( radians( lat ) ) ) ) AS distance FROM 18n-gPzxv_usPqtFJka9ytDArJgi3Hh8tlGnfuwrN WHERE distance <= 25
But the query is returning "could not parse query" error. I also tried below query but i got same error.
SELECT geometry, ZIP, latitude, longitude FROM 18n-gPzxv_usPqtFJka9ytDArJgi3Hh8tlGnfuwrN WHERE ST_DISTANCE(LATLNG(24.547123404292083, -114.32373084375001), LATLNG(37.4,-122.1)) <= 25
I can't calculate the distance after fetching all of the records. it's like millions of records (table size as of now is 10MB). I need a solution like just as we could fetch rows from a MYSQL table using spatial function like ST_DISTANCE or using the distance formula.
If any one could help giving some alternate or some out of the box solution, it would be awesome :)
You can't use functions like acos or sin in a Fusion Tables query; the only supported functions are the aggregate functions:
COUNT
SUM
AVERAGE
MAXIMUM
MINIMUM
ST_DISTANCE expects the first argument to be the name of a Location-column
ST_DISTANCE may not be used in a WHERE-clause, it's only supported in ORDER BY
summary: what you are trying to achieve is (currently) not possible with a FusionTable-Query

User Spatial Point column OR not in MySQL for Performance

I am using MySQL in my application to store a list of cities.
each long/lat represent the center of the city.
I want to be able to pull all cities that are close to a specific city by the distance of X kilometers.
My question is what will be performing faster for that purpose.
Using the Point column, and use "spatial" queries to retrieve the data ?
OR
Using a Float Longitude column And Float Latitude column. and then use java code to generate the long/lat between distance before running the SQL WHERE BETWEEN query on those values .
Another small question: does it make sense to request all cities that are within 10 kilometers of New York, when New York itself is probably bigger than 10 kilometers across?
Spatial extension will always be better in this case, since it's based on R-Tree indexes, which are optimized for range search in N-dimensional space.
Whereas native mysql indexes are B-Tree and in the best case only one field from the index will be used (for the range comparison), or no index at all (in case if you use some advanced geo formulas like in another answer).
You can use the great-circle distance formula (often loosely called the Haversine formula) to query the database.
The query below is using PDO
$stmt = $dbh->prepare("SELECT name, lat, lng, ( 6371 * acos( cos( radians(?) ) * cos( radians( lat ) ) * cos( radians( lng ) - radians(?) ) + sin( radians(?) ) * sin( radians( lat ) ) ) ) AS distance FROM mytable HAVING distance < ? ORDER BY distance LIMIT 0 , 20");
// Assign parameters
$stmt->bindParam(1,$center_lat);
$stmt->bindParam(2,$center_lng);
$stmt->bindParam(3,$center_lat);
$stmt->bindParam(4,$radius);
Where
6371 is the radius of Earth in km
$center_lat & $center_lng are the coordinates of the location
$radius is the radius of the search
This query took 1.93 secs to run on a 457K row unindexed database.
name Varchar(50)
lat Decimal(9,6)
lng Decimal(9,6)