Optimize my nearby locations MySQL query

I have a query that is making my site run very slowly. Essentially, it identifies the nearest locations to a longitude / latitude point in a database of around 2 million records.
Currently this query takes 7 seconds to complete.
I have done the following to speed it up (before, it took more than 15 seconds!):
Added index keys to name / longitude / latitude / path.
Stored the path in the database so that it does not need to run.
Stored results in another table so that we do not have to run the query again.
Considered splitting the database by country; however, this will cause a problem if the nearest location is in a neighboring country.
Any other ideas? Is there a way to limit the longitude / latitude in the query, e.g. to + or - 2 degrees?
SELECT name,path, ( 6371 * acos( cos( radians(?) ) * cos( radians( latitude ) ) * cos( radians( longitude ) - radians(?) ) + sin( radians(?) ) * sin( radians( latitude ) ) ) ) AS distance FROM ".$GLOBALS['table']." HAVING distance < 200 AND path IS NOT NULL
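The "+ or - 2 degrees" idea raised above could be sketched as follows. This is only an illustration in Python, assuming roughly 111.045 km per degree of latitude; the function name bounding_box and the WHERE fragment are hypothetical, and the sketch ignores the poles and the ±180 degree meridian.

```python
import math

KM_PER_DEGREE = 111.045  # assumed approximate length of one degree of latitude, in km


def bounding_box(lat, lon, radius_km):
    """Return (min_lat, max_lat, min_lon, max_lon) enclosing a circle of radius_km.

    Hypothetical helper: not valid near the poles or across the +/-180 meridian.
    """
    lat_delta = radius_km / KM_PER_DEGREE
    # A degree of longitude gets shorter by a factor of cos(latitude).
    lon_delta = radius_km / (KM_PER_DEGREE * math.cos(math.radians(lat)))
    return (lat - lat_delta, lat + lat_delta, lon - lon_delta, lon + lon_delta)


# Hypothetical WHERE fragment that the four values could be bound to, so that
# indexed columns cut down the rows before the distance expression runs.
WHERE_FRAGMENT = "latitude BETWEEN ? AND ? AND longitude BETWEEN ? AND ?"

if __name__ == "__main__":
    print(bounding_box(-36.848461, 174.763336, 200))
```

The distance expression would still be needed afterwards to turn the box into a circle, but it would then run only on the rows inside the box.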

Do not use separate latitude and longitude columns: indexes on them are useless here, because the distance has to be calculated for every record each time you query, with no way to optimise it.
MySQL now supports geospatial data through the POINT data type and CREATE SPATIAL INDEX, which MySQL knows how to optimise.
Something like this; though MySQL 8.0 should be even better.

Related

Optimizing nearby locations MYSQL query via Spatial Indexes

I am trying to optimize my SQL query to display the nearest locations.
Originally I was using the following query
SELECT name,path, ( 6371 * acos( cos( radians(-36.848461) ) * cos( radians( latitude ) ) * cos( radians( longitude ) - radians(174.763336) ) + sin( radians(-36.848461) ) * sin( radians( latitude ) ) ) ) AS distance FROM cityDB HAVING distance < 200 AND path IS NOT NULL
This query uses the longitude / latitude values to calculate the distance and takes 14.5 seconds to complete (too slow!)
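For reference, the distance expression in the query above can be reproduced outside the database, which helps when checking results. The Python sketch below mirrors that expression (strictly speaking, the spherical law of cosines with an Earth radius of 6371 km). The name great_circle_km is an illustrative choice, and the clamp before acos is an added safeguard against rounding, not part of the original query.

```python
import math

EARTH_RADIUS_KM = 6371.0  # same constant as in the SQL expression


def great_circle_km(lat1, lon1, lat2, lon2):
    """Distance in km between two points given in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_angle = (math.cos(phi1) * math.cos(phi2) * math.cos(dlon)
                 + math.sin(phi1) * math.sin(phi2))
    # Rounding can push the value slightly outside [-1, 1], where acos fails.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_KM * math.acos(cos_angle)


if __name__ == "__main__":
    # Distance from the query's centre point to a point one degree further north.
    print(great_circle_km(-36.848461, 174.763336, -35.848461, 174.763336))
```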
I created a spatial index (point) of the longitude / latitude values in an effort to speed up the query.
SELECT DISTINCT name,
(ST_Length(ST_LineStringFromWKB(
LineString(
pt,
ST_PointFromText('POINT(174.763336 -36.848461)', 4326)))))
AS distance
FROM CityDB
ORDER BY distance ASC LIMIT 99
However, this averages 23 seconds, even longer than the first query, which really surprised me!
Is there something else I should be doing with my query to speed it up?
The only thing that got it below 1 second was adding this (thanks to the post "Improving performance of spatial MySQL query"):
longitude BETWEEN longpoint - (50.0 / (111.045 * COS(RADIANS(latpoint))))
AND longpoint + (50.0 / (111.045 * COS(RADIANS(latpoint))))
However, there are a couple of bugs in the code. If I choose Fiji, it will only display locations < 180 longitude; if I choose Vanuatu, it will only display locations < -180 longitude.
SELECT DISTINCT name, (ST_Length
(ST_LineStringFromWKB(LineString( pt, ST_PointFromText('POINT(-179.276277 -18.378639)', 4326)))))
AS distance FROM CityDB
WHERE longitude BETWEEN -179.276277 - (50.0 / (111.045*COS(RADIANS(-18.378639)))) AND -179.276277 + (50.0 / (111.045 * COS(RADIANS(-18.378639))))
GROUP BY (name) ORDER BY distance ASC LIMIT 99
In addition to this, it will also display locations in Russia which is miles away (I know that we can add a limit to distance but there must be a better way than this)
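One possible way to handle the wrap-around at ±180 degrees is to split the longitude range in two when it crosses that meridian and OR the two BETWEEN conditions together. The Python sketch below only computes the ranges; longitude_ranges is a hypothetical helper, not something from the linked post. Note also that the filter above constrains longitude only, so rows at any latitude pass it, which would be consistent with far-away results showing up; a matching BETWEEN on latitude would narrow that.

```python
def longitude_ranges(lon, lon_delta):
    """Return a list of (min_lon, max_lon) ranges covering lon +/- lon_delta.

    The range is split in two when it crosses the +/-180 degree meridian.
    Hypothetical helper for illustration; all values are in decimal degrees.
    """
    if lon_delta >= 180.0:
        # The window covers the whole globe.
        return [(-180.0, 180.0)]
    low = lon - lon_delta
    high = lon + lon_delta
    if low < -180.0:
        # Window spills over the western edge: wrap the overflow to the east.
        return [(-180.0, high), (low + 360.0, 180.0)]
    if high > 180.0:
        # Window spills over the eastern edge: wrap the overflow to the west.
        return [(low, 180.0), (-180.0, high - 360.0)]
    return [(low, high)]


if __name__ == "__main__":
    # Longitude used in the Fiji query above, with a one-degree window.
    for min_lon, max_lon in longitude_ranges(-179.276277, 1.0):
        print("longitude BETWEEN", min_lon, "AND", max_lon)
```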
Any other efficient ways to make such a query?

How do I calculate distance using a Google Maps Fusion Table?

I am working with Google Map Fusion Tables and recently faced a tough problem while getting required data.
I am using below query:
SELECT geometry, ZIP, latitude, longitude,( 3959 * acos( cos( radians(37) ) * cos( radians( lat ) ) * cos( radians( lng ) - radians(-122) ) + sin( radians(37) ) * sin( radians( lat ) ) ) ) AS distance FROM 18n-gPzxv_usPqtFJka9ytDArJgi3Hh8tlGnfuwrN WHERE distance <= 25
But the query returns a "could not parse query" error. I also tried the query below, but I got the same error.
SELECT geometry, ZIP, latitude, longitude FROM 18n-gPzxv_usPqtFJka9ytDArJgi3Hh8tlGnfuwrN WHERE ST_DISTANCE(LATLNG(24.547123404292083, -114.32373084375001), LATLNG(37.4,-122.1)) <= 25
I can't calculate the distance after fetching all of the records, as there are millions of them (the table size is currently 10MB). I need a solution similar to fetching rows from a MySQL table using a spatial function like ST_DISTANCE or the distance formula.
If anyone could help by suggesting an alternative or an out-of-the-box solution, it would be awesome :)
You can't use functions like acos or sin in a FusionTable query. The only supported functions are the aggregate functions:
COUNT
SUM
AVERAGE
MAXIMUM
MINIMUM
ST_DISTANCE expects its first argument to be the name of a Location column.
ST_DISTANCE may not be used in a WHERE clause; it is only supported in ORDER BY.
Summary: what you are trying to achieve is (currently) not possible with a FusionTable query.

Geolocation queries in Doctrine2

I am using Doctrine2 and CodeIgniter2 for my test application. I have a table in my database that stores geographic locations, with the following fields:
Name
Latitude
Longitude
Created(Timestamp)
I see that the SQL statement using the haversine formula to select locations looks like this (as mentioned in another answer):
SELECT id,
( 3959 * acos( cos( radians(37) ) * cos( radians( lat ) ) * cos( radians( lng ) - radians(-122) ) + sin( radians(37) ) * sin( radians( lat ) ) ) ) AS distance
FROM markers
HAVING distance < 25
ORDER BY distance
LIMIT 0 , 20;
Now I am finding it difficult to do this using the query builder. I am not sure whether DQL or the QueryBuilder even supports trigonometric functions. There is also a chance that my db will be migrated to PostgreSQL, or it may stay on MySQL (yes, this is really a pain in the back), as that decision is out of my control.
All I was told is to use Doctrine's methods to achieve this, so that the db will be scalable in the future once it migrates to any of Doctrine's supported platforms. I know this is absurd. But is it really possible to query geolocation data using the latitude and longitude values in the database?
Regards,
Ashok Srinivasan
DQL only provides the following functions:
ABS
CONCAT
CURRENT_DATE()
CURRENT_TIME()
CURRENT_TIMESTAMP()
LENGTH(str)
LOCATE(needle, haystack [, offset])
LOWER(str)
MOD(a, b)
SIZE(collection)
SQRT(q)
SUBSTRING(str, start [, length])
UPPER(str)
DATE_ADD(date, days, unit)
DATE_SUB(date, days, unit)
DATE_DIFF(date1, date2)
However, you can create your own functions (RADIANS, for example); see "Adding your own functions to the DQL language" in the Doctrine documentation.

Use Spatial Point column or not in MySQL for performance

I am using MySQL in my application to store a list of cities.
Each long/lat pair represents the center of a city.
I want to be able to pull all cities that are within X kilometers of a specific city.
My question is which approach will perform faster for that purpose:
Using a Point column and "spatial" queries to retrieve the data?
OR
Using a Float Longitude column and a Float Latitude column, and then using Java code to compute the long/lat bounds for the distance before running a SQL WHERE ... BETWEEN query on those values?
Another small question: does it make sense to request all cities within 10 kilometers of New York, when New York itself probably spans more than 10 kilometers?
The spatial extension will always be better in this case, since it is based on R-tree indexes, which are optimized for range searches in N-dimensional space.
Native MySQL indexes, by contrast, are B-trees, and in the best case only one field from the index will be used (for the range comparison), or no index at all (if you use advanced geo formulas like the one in another answer).
You can use the Haversine formula to query the database.
The query below uses PDO:
$stmt = $dbh->prepare("SELECT name, lat, lng, ( 6371 * acos( cos( radians(?) ) * cos( radians( lat ) ) * cos( radians( lng ) - radians(?) ) + sin( radians(?) ) * sin( radians( lat ) ) ) ) AS distance FROM mytable HAVING distance < ? ORDER BY distance LIMIT 0 , 20");
// Assign parameters
$stmt->bindParam(1,$center_lat);
$stmt->bindParam(2,$center_lng);
$stmt->bindParam(3,$center_lat);
$stmt->bindParam(4,$radius);
Where:
6371 is the radius of the Earth in km
$center_lat and $center_lng are the coordinates of the search location
$radius is the radius of the search
This query took 1.93 seconds to run on a 457K-row unindexed database with the following columns:
name Varchar(50)
lat Decimal(9,6)
lng Decimal(9,6)

Geolocation and Haversine formula

I am trying to create a basic web application that detects the user's geolocation, queries a MySQL database, and returns all bus stops within, say, 5 kilometers.
The GTFS feed, including the longitude and latitude, has been inserted into a MySQL database, and I found an example HTML page that provides the longitude and latitude of the browser accessing the web application.
I am seeking some help writing the MySQL query that takes this information and returns the results.
Although the great-circle formula is precise, you don't need that precision in this case. A minute of latitude is about 1 nautical mile (1.85 km). A minute of longitude is about cos(LAT) nautical miles. I would consider selecting the box of LAT +/- 3 minutes and LONG +/- (3/cos(LAT)) minutes. If you really need a circle, not a box, then just treat the values as Euclidean coordinates. The error on this scale is less than the length of the bus.
The only tricky part is that the length of a minute of longitude varies depending on how far from the equator you are.
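The box-and-flat-plane approach described above might look like this in Python. It is a rough sketch that takes a minute of latitude as 1.852 km (one nautical mile); the names box_in_minutes and flat_distance_km are hypothetical, and the approximation is only meant for short distances away from the poles.

```python
import math

KM_PER_MINUTE = 1.852  # assumed: one minute of latitude is one nautical mile


def box_in_minutes(lat, radius_km):
    """Half-widths (lat_minutes, lon_minutes) of a box enclosing radius_km."""
    lat_minutes = radius_km / KM_PER_MINUTE
    # Minutes of longitude are shorter by cos(latitude), so more are needed.
    lon_minutes = lat_minutes / math.cos(math.radians(lat))
    return lat_minutes, lon_minutes


def flat_distance_km(lat1, lon1, lat2, lon2):
    """Approximate distance in km, treating the coordinates as Euclidean."""
    mean_lat = math.radians((lat1 + lat2) / 2.0)
    dy = (lat2 - lat1) * 60.0 * KM_PER_MINUTE
    dx = (lon2 - lon1) * 60.0 * KM_PER_MINUTE * math.cos(mean_lat)
    return math.hypot(dx, dy)


if __name__ == "__main__":
    print(box_in_minutes(37.0, 5.0))
    print(flat_distance_km(37.0, -122.0, 37.02, -122.03))
```

For a 5 km radius this gives a latitude half-width of a little under 3 minutes, in line with the suggestion above.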
In this link you find:
SELECT id, ( 3959 * acos( cos( radians(37) ) * cos( radians( lat ) ) * cos( radians( lng ) - radians(-122) ) + sin( radians(37) ) * sin( radians( lat ) ) ) ) AS distance FROM markers HAVING distance < 25 ORDER BY distance LIMIT 0 , 20;