How to optimize complex calculation query execution time? - mysql

I have a query like this :
SELECT *, (
6371 * acos (
cos ( radians(33.577718) )
* cos( radians( `Latitude` ) )
* cos( radians( `Longitude` ) - radians(115.846524) )
+ sin ( radians(33.577718) )
* sin( radians( `Latitude` ) )
)
) AS `distance`
FROM `geopc_cn_places_grouped`
WHERE `Latitude`!=33.577718 AND `Longitude`!=115.846524
HAVING `distance` < 200
ORDER BY `distance` ASC
LIMIT 30;
The query execution is always somewhere between 3.5 and 4 seconds.
I have applied a composite index to Latitude and Longitude by running ALTER TABLE geopc_cn_places_grouped ADD INDEX index_Longitude_Latitude(Longitude, Latitude);, but it doesn't reduce the execution time.
I want to know why it's running slow and what possible optimizations can be done.
The slow query log message shows this
and this is the EXPLAIN SELECT query
Table Structure...
and lastly, here is the table index list

Your query as written isn't sargable. That is, it cannot exploit any index. So, each time you run it, you use that big spherical cosine law formula for every row in your table. It's a full table scan. It's likely that most of your slowness comes from the table scan, because modern computers do the math pretty quickly once they have the data in RAM.
But you're in luck. Your search looks for points within a 200 km radius of your candidate point (the 6371 factor in your formula is the Earth's radius in kilometres, so distance comes out in km). That means you can use a WHERE ... BETWEEN clause to eliminate points that are more than 200 km south or north (latitude) of your starting point.
To do this you need to know there are 69.0 statute miles, 60 nautical miles, and 111.045 km in each degree of latitude. Therefore you should search for latitudes within ± (200/111.045) degrees of your point. So try a query like this.
SELECT *, (
6371 * acos (
cos ( radians(33.577718) )
* cos( radians( `Latitude` ) )
* cos( radians( `Longitude` ) - radians(115.846524) )
+ sin ( radians(33.577718) )
* sin( radians( `Latitude` ) )
)
) AS `distance`
FROM `geopc_cn_places_grouped`
WHERE `Latitude`!=33.577718 AND `Longitude`!=115.846524
AND Latitude BETWEEN 33.577718 - (200/111.045) AND 33.577718 + (200/111.045)
HAVING `distance` < 200
ORDER BY `distance` ASC
LIMIT 30;
Then create an index on your Latitude column.
CREATE INDEX latsearch ON geopc_cn_places_grouped(Latitude);
The Latitude BETWEEN clause I suggest will then do an index range scan and so skip many of the rows in your table. That's the classic SQL way of making queries faster.
This is a simplification of the ideal answer to this question. I wrote up this problem here.
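The latitude window in the BETWEEN clause can also be computed in application code, and the same idea extends to longitude. Here is a minimal sketch in Python, assuming the distance is in kilometres (6371 is the Earth's radius in km) and roughly 111.045 km per degree of latitude. The function name `bounding_box` is hypothetical, and the longitude width is an approximation that breaks down near the poles and across the ±180° meridian.

```python
import math

KM_PER_DEGREE_LAT = 111.045  # approximate km per degree of latitude


def bounding_box(lat, lon, radius_km):
    """Return (min_lat, max_lat, min_lon, max_lon) approximately enclosing the circle.

    Not valid near the poles or across the +/-180 meridian.
    """
    dlat = radius_km / KM_PER_DEGREE_LAT
    # A degree of longitude shrinks by cos(latitude) away from the equator.
    dlon = radius_km / (KM_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


# Search centre taken from the question, 200 km radius.
min_lat, max_lat, min_lon, max_lon = bounding_box(33.577718, 115.846524, 200)
print(min_lat, max_lat, min_lon, max_lon)
```

The four values can be bound as parameters to `Latitude BETWEEN ? AND ?` and `Longitude BETWEEN ? AND ?`. The exact distance expression in the HAVING clause still does the final filtering.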

Your query must compute the distance for every row. The quick solution is to use a "bounding box". This limits the number of rows to test to a latitude stripe or longitude stripe.
Details (and more advanced speedups): http://mysql.rjweb.org/doc.php/find_nearest_in_mysql

Related

Optimizing nearby locations MYSQL query via Spatial Indexes

I am trying to optimize my SQL query to display the nearest locations.
Originally I was using the following query
SELECT name,path, ( 6371 * acos( cos( radians(-36.848461) ) * cos( radians( latitude ) ) * cos( radians( longitude ) - radians(174.763336) ) + sin( radians(-36.848461) ) * sin( radians( latitude ) ) ) ) AS distance FROM cityDB HAVING distance < 200 AND path IS NOT NULL
This query uses the longitude / latitude values to calculate the distance and takes 14.5 seconds to complete (too slow!)
I created a spatial index (point) of the longitude / latitude values in an effort to speed up the query.
SELECT DISTINCT name,
(ST_Length(ST_LineStringFromWKB(
LineString(
pt,
ST_PointFromText('POINT(174.763336 -36.848461)', 4326)))))
AS distance
FROM CityDB
ORDER BY distance ASC LIMIT 99
However, this averages 23 seconds, even longer than the first query, which really surprised me!
Is there something else I should be doing with my query to speed it up?
The only thing that I got to get it below 1 second was to add this (thanks to this post Improving performance of spatial MySQL query)
longitude BETWEEN longpoint - (50.0 / (111.045 * COS(RADIANS(latpoint))))
AND longpoint + (50.0 / (111.045 * COS(RADIANS(latpoint))))
However, there are a couple of bugs in the code. If I choose Fiji, it only displays locations with longitude < 180; if I choose Vanuatu, it only displays locations with longitude < -180.
SELECT DISTINCT name, (ST_Length
(ST_LineStringFromWKB(LineString( pt, ST_PointFromText('POINT(-179.276277 -18.378639)', 4326)))))
AS distance FROM CityDB
WHERE longitude BETWEEN -179.276277 - (50.0 / (111.045*COS(RADIANS(-18.378639)))) AND -179.276277 + (50.0 / (111.045 * COS(RADIANS(-18.378639))))
GROUP BY (name) ORDER BY distance ASC LIMIT 99
In addition to this, it will also display locations in Russia which is miles away (I know that we can add a limit to distance but there must be a better way than this)
Any other efficient ways to make such a query?
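The Fiji and Vanuatu behaviour is consistent with the longitude window running past ±180°, where a single BETWEEN range cannot match rows on the other side of the date line. One possible workaround, offered as an assumption rather than something from the linked post, is to split the window into two ranges and OR them together in the WHERE clause. The sketch below is in Python. The function name `longitude_ranges` is hypothetical, and the 200 km radius is chosen only so that the window actually crosses the meridian.

```python
import math


def longitude_ranges(lat, lon, radius_km, km_per_degree=111.045):
    """Return one or two (low, high) longitude ranges covering the search window."""
    half_width = radius_km / (km_per_degree * math.cos(math.radians(lat)))
    low, high = lon - half_width, lon + half_width
    if low < -180.0:
        # Window spills west past -180: wrap the overflow to the eastern side.
        return [(-180.0, high), (low + 360.0, 180.0)]
    if high > 180.0:
        # Window spills east past +180: wrap the overflow to the western side.
        return [(low, 180.0), (-180.0, high - 360.0)]
    return [(low, high)]


# Point from the question (latitude -18.378639, longitude -179.276277).
for low, high in longitude_ranges(-18.378639, -179.276277, 200.0):
    print(f"longitude BETWEEN {low:.6f} AND {high:.6f}")
```

The query in the question restricts longitude only, so any row in the same longitude band qualifies regardless of latitude. A matching latitude BETWEEN range would exclude far-away rows such as the ones in Russia.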

Optimize my nearby locations MYSQL query

I have a query which is causing my site to run very slow. Essentially it identifies the nearest location to a longitude / latitude point from a database of around 2m records.
Currently this query takes 7 seconds to complete.
I have done the following to speed it up (before it was more than 15 seconds!)
Added index keys to name / longitude / latitude / path
stored the path in the database so that it does not need to run
Stored results into another table so we do not have to run the query again.
Considered splitting the database by country, however this will cause a problem if the nearest location is in a neighboring country.
Any other ideas? Is there a way to possibly limit the longitude / latitude in the query eg + or - 2 degrees?
SELECT name,path, ( 6371 * acos( cos( radians(?) ) * cos( radians( latitude ) ) * cos( radians( longitude ) - radians(?) ) + sin( radians(?) ) * sin( radians( latitude ) ) ) ) AS distance FROM ".$GLOBALS['table']." HAVING distance < 200 AND path IS NOT NULL
Do not use separate latitude and longitude columns for this. Ordinary indexes are useless here, because the distance must be calculated for every record on every query, with no way to optimise it.
MySQL now supports geospatial data using POINT datatype and CREATE SPATIAL INDEX, which MySQL knows how to optimise.
Something like this; though MySQL 8.0 should be even better.
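The query the answer pointed to is not reproduced here. As a rough illustration only, a spatial-index lookup could take the following shape: the application builds a rectangle around the search circle, and MySQL filters with MBRContains before computing exact distances with ST_Distance_Sphere. The Python sketch assumes a table `cityDB` with a POINT column `pt` (names borrowed from a related question), stored with SRID 0 so that x is longitude and y is latitude, and covered by a SPATIAL index. The helper name `envelope_wkt` is hypothetical.

```python
import math


def envelope_wkt(lat, lon, radius_km, km_per_degree=111.045):
    """Build a WKT rectangle (x = longitude, y = latitude) around the search circle."""
    dlat = radius_km / km_per_degree
    dlon = radius_km / (km_per_degree * math.cos(math.radians(lat)))
    west, east = lon - dlon, lon + dlon
    south, north = lat - dlat, lat + dlat
    # A WKT polygon ring must end on the point it started from.
    corners = [(west, south), (east, south), (east, north), (west, north), (west, south)]
    ring = ", ".join(f"{x:.6f} {y:.6f}" for x, y in corners)
    return f"POLYGON(({ring}))"


# Assumed schema: table cityDB, POINT column pt (SRID 0), SPATIAL index on pt.
SQL = (
    "SELECT name, ST_Distance_Sphere(pt, ST_GeomFromText(%s)) / 1000 AS distance_km "
    "FROM cityDB "
    "WHERE MBRContains(ST_GeomFromText(%s), pt) "
    "ORDER BY distance_km LIMIT 30"
)

center_wkt = "POINT(174.763336 -36.848461)"  # longitude first, then latitude
params = (center_wkt, envelope_wkt(-36.848461, 174.763336, 200))
print(SQL)
print(params)
```

Whether the optimizer actually uses the spatial index depends on the MySQL version and on how the column's SRID is declared, so check the plan with EXPLAIN.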

How do I calculate distance using a Google Maps Fusion Table?

I am working with Google Map Fusion Tables and recently faced a tough problem while getting required data.
I am using below query:
SELECT geometry, ZIP, latitude, longitude,( 3959 * acos( cos( radians(37) ) * cos( radians( lat ) ) * cos( radians( lng ) - radians(-122) ) + sin( radians(37) ) * sin( radians( lat ) ) ) ) AS distance FROM 18n-gPzxv_usPqtFJka9ytDArJgi3Hh8tlGnfuwrN WHERE distance <= 25
But the query is returning "could not parse query" error. I also tried below query but i got same error.
SELECT geometry, ZIP, latitude, longitude FROM 18n-gPzxv_usPqtFJka9ytDArJgi3Hh8tlGnfuwrN WHERE ST_DISTANCE(LATLNG(24.547123404292083, -114.32373084375001), LATLNG(37.4,-122.1)) <= 25
I can't calculate the distance after fetching all of the records. it's like millions of records (table size as of now is 10MB). I need a solution like just as we could fetch rows from a MYSQL table using spatial function like ST_DISTANCE or using the distance formula.
If any one could help giving some alternate or some out of the box solution, it would be awesome :)
You can't use functions like acos or sin in a FusionTable query; the only supported functions are the aggregate functions:
COUNT
SUM
AVERAGE
MAXIMUM
MINIMUM
ST_DISTANCE expects the first argument to be the name of a Location-column
ST_DISTANCE may not be used in a WHERE-clause, it's only supported in ORDER BY
summary: what you are trying to achieve is (currently) not possible with a FusionTable-Query

Geolocation queries in Doctrine2

I am using Doctrine2 and CodeIgniter2 for my test application. I have a table in my database that stores all the geographic locations, with these fields:
Name
Latitude
Longitude
Created(Timestamp)
I see that the SQL statement using the haversine formula to select locations looks like this (as mentioned in another answer):
SELECT id,
( 3959 * acos( cos( radians(37) ) * cos( radians( lat ) ) * cos( radians( lng ) - radians(-122) ) + sin( radians(37) ) * sin( radians( lat ) ) ) ) AS distance
FROM markers
HAVING distance < 25
ORDER BY distance
LIMIT 0 , 20;
Now I am finding it difficult to do this using the query builder. I am not sure if DQL or the query builder even supports trigonometric functions. There is also a chance that my db will be migrated to PostgreSQL, or it may stay on MySQL (yes, this is really a pain in the back), as that decision is out of my control.
All I was told is to use doctrine's methods to achieve this and hence the db will become scalable in the future once it migrates to any of the doctrine's supported platforms. I know this is absurd. But is it really possible to query geolocation data using the latitude and longitude values in the database?
Regards,
Ashok Srinivasan
DQL only provides the following functions:
ABS
CONCAT
CURRENT_DATE()
CURRENT_TIME()
CURRENT_TIMESTAMP()
LENGTH(str)
LOCATE(needle, haystack [, offset])
LOWER(str)
MOD(a, b)
SIZE(collection)
SQRT(q)
SUBSTRING(str, start [, length])
UPPER(str)
DATE_ADD(date, days, unit)
DATE_SUB(date, days, unit)
DATE_DIFF(date1, date2)
However, you can create your own functions (RADIANS, for example); see "Adding your own functions to the DQL language" in the Doctrine documentation.

User Spatial Point column OR not in MySQL for Performance

I am using MySQL in my application to store a list of cities.
Each long/lat pair represents the center of the city.
I want to be able to pull all cities that are close to a specific city by the distance of X kilometers.
My question is what will be performing faster for that purpose.
Using the Point column, and use "spatial" queries to retrieve the data ?
OR
Using a Float Longitude column and a Float Latitude column, then using Java code to generate the long/lat bounds for the distance before running the SQL WHERE ... BETWEEN query on those values.
Another small question: does it make sense to request all cities that are within 10 kilometers of New York, when New York itself is probably bigger than 10 kilometers across?
Spatial extension will always be better in this case, since it's based on R-Tree indexes, which are optimized for range search in N-dimensional space.
Native MySQL indexes, by contrast, are B-Tree. In the best case only one field from the index will be used (for the range comparison), and no index will be used at all if you apply a geo formula to the columns, as in another answer.
You can use the Haversine formula to query the database.
The query below is using PDO
$stmt = $dbh->prepare("SELECT name, lat, lng, ( 6371 * acos( cos( radians(?) ) * cos( radians( lat ) ) * cos( radians( lng ) - radians(?) ) + sin( radians(?) ) * sin( radians( lat ) ) ) ) AS distance FROM mytable HAVING distance < ? ORDER BY distance LIMIT 0 , 20");
// Assign parameters
$stmt->bindParam(1,$center_lat);
$stmt->bindParam(2,$center_lng);
$stmt->bindParam(3,$center_lat);
$stmt->bindParam(4,$radius);
Where
6371 is the radius of Earth in km
$center_lat & $center_lng are the coordinates of the search center
$radius is the radius of the search, in km
This query took 1.93 secs to run on a 457K row unindexed database.
name Varchar(50)
lat Decimal(9,6)
lng Decimal(9,6)
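To check results outside the database, the distance expression from the query can be reproduced in application code. Here is a minimal Python sketch, assuming the same 6371 km Earth radius. The function name `great_circle_km` is hypothetical, and the clamp is an addition that guards acos against values rounded slightly outside [-1, 1]. Strictly speaking this expression is the spherical law of cosines rather than the haversine form, but both give the great-circle distance.

```python
import math

EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance using the same expression as the SQL query."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2) - math.radians(lon1)
    cos_angle = (math.cos(phi1) * math.cos(phi2) * math.cos(dlon)
                 + math.sin(phi1) * math.sin(phi2))
    # Rounding can push the value just past +/-1, which would make acos raise.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_KM * math.acos(cos_angle)


# One degree of latitude along a meridian should be close to 111 km.
print(round(great_circle_km(0.0, 0.0, 1.0, 0.0), 1))
```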