I'm looking for a way to select points within a circle by distance. I have one point with a latitude and longitude, and I want to find any points in my database that are around it. And yes, it must be a circle!
I'm using this clause in the query (I just googled it, I can't do the math):
((6373 * acos (cos ( radians( 48.568962 ) ) * cos( radians( X(coords) ) ) * cos( radians( Y(coords) ) - radians( 6.821352 ) ) + sin ( radians( 48.568962 ) ) * sin( radians( X(coords) ) ))) <='0.2')
0.2 = 200 meters
I'm using POINT data type
Yes, I have SPATIAL index on it
Yes, I'm trying to use the "spatial" functions, but they don't return a circle; they return an OVAL and I need a PRECISE circle
This "circle" clause takes a very, very, VERY long time across all tables. When I use the OVAL method with the SPATIAL functions, it takes maybe 0.1 s and that's great! But I need a circle, and this takes 17 s, LOL.
Can someone help me? Thanks a lot, guys!
EDIT: by spatial functions I mean something like this:
WHERE ST_Contains(ST_Buffer(
ST_GeomFromText('POINT(12.3456 34.5678)'), (0.00001*1000)) , coords) <= 1 /* 1 km */
EDIT 2 (table struct.):
I'm expecting 10 rows from these tables; of course I have indexes on wz_uuid
select a....., b.... from table_1 a left join table_2 b on a.wz_uuid=b.wz_uuid
And this is not just 2 tables; I have 11 tables * 2 like this (weekly database backups). The first tables (_1) have 0-4000 rows; tables 2-11 have 300k+ rows.
All indexes are relevant and also data types & encoding.
wz_uuid & id - unique, btree index
others - btree indexes
coords - spatial index
Great solution from XX sec to 100 ms, that's all I want :-)
Use MySQL spatial extensions to select points inside circle
What's the best way to optimize this query?
$tripsNearLocation = mysqli_query($con,
"SELECT * FROM (
SELECT *
, ( 3959 * acos( cos(" . $latRad . ")
* cos( radians( startingLatitude ) )
* cos( radians( startingLongitude )
- (" . $longRad . ") )
+ sin(" . $latRad . ")
* sin( radians( startingLatitude ) ) ) )
AS distance FROM trips
) as query
WHERE distance < 10
ORDER BY distance LIMIT 0 , 10;");
With 50,000 rows it takes it a second or two to finish. Should I add a different query that eliminates all rows that aren't even in the "close range" of the coordinates inputted then calculate the remaining rows? Say if the latitude coordinate inputted is 67, eliminate all rows with latitude coordinate that isn't from 65-69.
Or add a "state column" where it removes all rows from calculations if they aren't in the same state?
Or just deal with the 2 seconds of calculations? I'm worried the database may contain more than 100,000 rows and it will take too long to execute.
Plan A: For 100K rows, you might get away with just narrowing down by latitude. That is,
calculate the degrees latitude that corresponds to "10" units of distance
Have INDEX(startingLatitude)
Add to the WHERE clause to limit it to startingLatitude plus/minus "10". Perhaps your example is AND startingLatitude BETWEEN 65 AND 69.
If you are thinking about using INDEX(lat, lng), it is not as simple. See if Lat is good enough.
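As a rough illustration of the first step of Plan A, here is a minimal Python sketch that converts a distance into a latitude band. It assumes the same 3959-mile Earth radius used in the question's query; the function name latitude_band and the sample latitude of 67 are only for illustration.

```python
import math

EARTH_RADIUS_MILES = 3959.0  # same constant as the query in the question


def latitude_band(lat_deg, distance_miles):
    """Return (min_lat, max_lat) in degrees covering distance_miles north and south."""
    # An arc of length d on a sphere of radius R spans d / R radians.
    delta_deg = math.degrees(distance_miles / EARTH_RADIUS_MILES)
    return lat_deg - delta_deg, lat_deg + delta_deg


low, high = latitude_band(67.0, 10.0)
# The two bounds would be bound as query parameters in practice.
print(f"AND startingLatitude BETWEEN {low:.4f} AND {high:.4f}")
```

For 10 miles the band is roughly 0.145 degrees on either side, which is much narrower than a fixed plus/minus 2 degrees and so leaves fewer rows for the distance formula to process.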
Plan B: Next choice will involve lat and lng, plus a subquery. And version 5.6 would be beneficial. It's something like this (after including INDEX(lat, lng, id)):
SELECT ... FROM (
SELECT id FROM tbl
WHERE lat BETWEEN...
AND lng BETWEEN... ) x
JOIN tbl USING (id)
WHERE ...;
For various reasons, Plan B is only slightly better than Plan A.
Plan C: If you are going to need millions of rows, you will need my pizza parlor algorithm. This involves a Stored Procedure to repeatedly probe, looking for enough rows. It also involves PARTITIONing to get a crude 2D index.
Plans A and B are O(sqrt(N)); Plan C is O(1). That is, for Plans A and B, if you quadruple the number of rows, you double the time taken. Plan C does not get slower. (It sounded like your code is O(N) -- double the rows = double the time.)
This is how I ended up solving it, in case people need to reference this in the future.
$tripsNearLocation = mysqli_query($con, "SELECT * FROM (
SELECT *, (3959 * acos(cos(" . $latRad . ") * cos(radians(startingLatitude))
* cos(radians(startingLongitude) - (" . $longRad . ")) + sin(" . $latRad . ")
* sin(radians(startingLatitude)))) AS distance FROM (
SELECT * FROM trips_test WHERE startingLatitude BETWEEN " .
($locationLatitude - 1) . " AND " . ($locationLatitude + 1) . ") as query1)
as query2 WHERE distance < 10 ORDER BY distance LIMIT 0 , 10;");
Although I will accept Rick James' answer as he helped me get to this solution.
I have the following query to find the nearest locations around a given lat-lon.
I followed Mr. Ollie's blog post "Nearest-location finder for MySQL" to find the nearest locations around a given lat-long using the haversine formula.
But because I don't know much about spatial data queries, I failed to execute it properly, so I'm looking for an expert's advice to solve this.
Here is my query:
SELECT z.id,
p.distance_unit
* DEGREES(ACOS(COS(RADIANS(p.latpoint))
* COS(RADIANS(z.(x(property))))
* COS(RADIANS(p.longpoint) - RADIANS(z.(y(property))))
+ SIN(RADIANS(p.latpoint))
* SIN(RADIANS(z.(x(property)))))) AS distance_in_km
FROM mytable AS z
JOIN ( /* these are the query parameters */
SELECT 12.00 AS latpoint, 77.00 AS longpoint,
20.0 AS radius, 111.045 AS distance_unit
) AS p
WHERE z.(x(property))
BETWEEN p.latpoint - (p.radius / p.distance_unit)
AND p.latpoint + (p.radius / p.distance_unit)
AND z.(y(property)
BETWEEN p.longpoint - (p.radius / (p.distance_unit * COS(RADIANS(p.latpoint))))
AND p.longpoint + (p.radius / (p.distance_unit * COS(RADIANS(p.latpoint))))
ORDER BY distance_in_km
LIMIT 15;
When I run this query I get this error:
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '(x(property)))) * COS(RADIANS(p.longpoint) - RADIANS(z.(y(geo' at line 1
I also tried z.(GeomFromText(x.property)))
This is my table description:
+-----------+----------------+
| Field | Type |
+-----------+----------------+
| id | Int(10) |
| property | geometry |
+-----------+----------------+
select x(property) from mytable; //gives me lat
select y(property) from mytable; //gives me lon
Where am I going wrong?
Is this the right way to achieve this?
Please suggest.
It seems to me that you are assuming that once you have selected z.id in the query, this gives you direct access to x(property) and y(property).
(Aside - do those names really have parentheses in them?)
So to me it looks like you should replace things like
* COS(RADIANS(z.(x(property))))
with something like
* COS(RADIANS( select x(property) from mytable where id = z.id ))
However on further thinking about it, I think that your mytable doesn't have the required structure. From looking at the link, I believe that your mytable should have a structure more like:
+-----------+----------------+
| Field | Type |
+-----------+----------------+
| id | Int(10) |
| latitude | Float |
| longitude | Float |
+-----------+----------------+
So that you can do something like
* COS(RADIANS(z.latitude))
NOTE
The above was based on me not understanding that MySQL supports spatial data types (which I have no idea how to use).
Update
I just did some googling to understand the spatial types and found this:
How do you use MySQL spatial queries to find all records in X radius? [closed]
which suggests that you can't do what you want to do with spatial data types in MySQL. That brings you back to storing the data in mytable in a non-optimal way.
However in re-reading that link, the comments to the answer suggest that you may now be able to use spatial data types. (I told you I didn't have a clue here) This would mean replacing the query code with things like ST_Distance(g1,g2), which effectively means totally rewriting the example.
To put it another way
The example you gave presumes that spatial data types and testing of
geometries do not exist in MySQL. But now that they do exist, they
make this set of example code irrelevant, and you are in for a
world of hurt if you try to combine the two forms of analysis.
update 2
There are three paths you can follow:
Deny that spatial data types exist in MySQL and use a table that has explicit columns for lat and long, and use the sample code as originally written on that blog.
Embrace the MySQL spatial data types (warts and all) and take a look at things like this answer https://stackoverflow.com/a/21231960/31326 that seem to do what you want directly with spatial data types, but as noted in that answer there are some caveats.
Use a spatial type to hold your data, and use a pre-query to extract lat and long before passing it into the original sample code.
Finally I followed this link and ended up with this.
The query is not optimized yet but it works great.
Here is my query:
select id, ( 3959 * acos( cos( radians(12.91841) ) * cos( radians( y(property) ) ) * cos( radians( x(property)) - radians(77.58631) ) + sin( radians(12.91841) ) * sin( radians(y(property) ) ) ) ) AS distance from mytable having distance < 10 order by distance limit 10;
This is a tough one to explain. I'm able to find all zipcodes within a radius of x miles. However, what I want to do is find all user IDs from tblUsers whose MaxDistance is greater than or equal to their distance from a given zipcode.
So in plain English, I want to know all the people who are within range of a zipcode based on their MaxDistance.
For example I have a table:
tblUsers(ID int, Maxdistance int,Zipcode varchar(5))
1|50|94129
2|25|94111
3|100|19019
In my second table:
tblTmpPlaces(ID int,Zipcode varchar(5))
1|94129
What I want to do is, using the zipcode in tblTmpPlaces, be able to say that users 1 and 2 are within their max distance and select them. However, user 3's max distance is 100, which is not enough to reach the tblTmpPlaces zipcode of 94129: 94129 is San Francisco and 19019 is Philadelphia, so that user is well over 100 miles from San Francisco.
This is what I've been using to get the distance, but it uses a central location to find everything within an area and doesn't take MaxDistance into consideration. Any help is appreciated.
So basically: select ID from tblUsers where ... and this is the part I'm stumbling on.
SELECT Zipcode
FROM tblZipcodes
WHERE ( 3959
        * acos(  cos(radians(#XLocationParam))
               * cos(radians(x(location)))
               * cos(radians(y(location)) - radians(#YLocationParam))
               + sin(radians(#XLocationParam))
               * sin(radians(x(location))) ) ) <= 30
It really looks like you need the latitude and longitude for the "center" of each zipcode. Without that, MySQL can't calculate the distance between the zip codes.
tblZipcodeLatLong
( Zipcode varchar(5)
, latitude decimal(7,4)
, longitude decimal(7,4)
)
Then you could calculate the distance between all of the Zipcodes, using your Great Circle Distance (GCD) formula.
For performance, though, you'll likely not want to do that in each individual query, but rather, you'd want to pre-calculate the distance between all the Zipcodes, and have those calculated distances stored in a table.
SELECT p1.Zipcode AS p1_Zipcode
, p2.Zipcode AS p2_Zipcode
, <gcd_formula> AS distance
FROM tblZipcodeLatLong p1
CROSS
JOIN tblZipcodeLatLong p2
Where <gcd_formula> represents your great circle distance formula, which calculates the distance between each pair of zipcodes.
A query of this form would return the result set you are looking for:
SELECT u.*, p.*
FROM tblTmpPlaces p
JOIN (
SELECT p1.Zipcode AS p_Zipcode
, p2.Zipcode AS u_Zipcode
, <gcd_formula> AS distance
FROM tblZipcodeLatLong p1
CROSS
JOIN tblZipcodeLatLong p2
) d
ON d.p_Zipcode = p.Zipcode
JOIN tblUsers u
ON u.Zipcode = d.u_Zipcode
AND u.Maxdistance >= d.distance
WHERE p.Zipcode = '94129'
As I noted before, doing that cross join operation and calculating all those distances in that subquery (aliased as d) on each query could be quite a bit of overhead. For performance, you'd likely want those results pre-calculated, stored in an appropriately indexed table, and then replace that subquery with a reference to the pre-populated table.
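To make the pre-calculation idea concrete, here is a small Python sketch of the same logic outside the database. The haversine function stands in for <gcd_formula>, the dictionary distance_table stands in for the pre-populated distance table, and the zip-code coordinates are rough, assumed values for illustration only, not authoritative data. The user rows are the ones from the question.

```python
import math
from itertools import product

EARTH_RADIUS_MILES = 3959.0


def gcd_miles(lat1, lon1, lat2, lon2):
    """Great circle distance in miles, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Rough zip-code centers (assumed values for illustration only).
zip_centers = {
    "94129": (37.80, -122.47),
    "94111": (37.80, -122.40),
    "19019": (40.00, -75.10),
}

# tblUsers rows from the question: (ID, Maxdistance, Zipcode)
users = [(1, 50, "94129"), (2, 25, "94111"), (3, 100, "19019")]

# Calculate every pair once; this plays the role of the cross join subquery d.
distance_table = {
    (z1, z2): gcd_miles(*zip_centers[z1], *zip_centers[z2])
    for z1, z2 in product(zip_centers, repeat=2)
}


def users_within_reach(place_zip):
    """IDs of users whose Maxdistance covers the distance to place_zip."""
    return [
        user_id
        for user_id, max_distance, user_zip in users
        if distance_table[(place_zip, user_zip)] <= max_distance
    ]


print(users_within_reach("94129"))  # [1, 2]
```

The lookup per query is then a join against stored distances rather than a trigonometric calculation per pair, which is the point of pre-populating the table.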
NOTE:
I have a GCD formula in one of the other answers I posted here on stackoverflow a while back. I'll see if I can find it.
Similar question answered here:
MYSQL sorting by HAVING distance but not able to group?
I did the exact same thing. If I understand correctly, you are already getting the zipcodes (possibly in an array), so the simple thing to do would be to match those zipcodes to the corresponding users using the IN operator in SQL:
SELECT *
FROM users
WHERE zipcode IN ('94129', '94111');
The list inside IN (...) is built from zipcodes_array, the array you already have; the two values shown are just placeholders.
I'm having trouble with this query. It takes about 5 seconds to run against ~300 users. I assume it's because it's calculating the distance for every possible user.
Is there a way for me to optimize this to make it run fast? Thanks in advance.
select
t2.*,
t2.city,
t2.state,
t2.county,
ifnull(round((6371 * acos( cos( radians('32.7211') ) * cos( radians( t2.latitude ) ) * cos( radians( t2.longitude ) - radians('-117.16431') ) + sin( radians('32.7211') ) * sin( radians( t2.latitude ) ) ) ),0),1) AS distance
from
users t1
inner join
zipcodes_coordinates t2
on t1.zip_code=t2.zipcode
having
distance <= 150
I would eliminate as much of the data as you can before the main bit of the query is run, which you list. Your query is almost certainly looping over every single row in the table.
For example, you know that if a user at (X,Y) is within an R mile circle of a certain point X',Y', then they are certainly within a square of side 2R centred on that point, which means the following things hold:
X <= X' + R
X >= X' - R
Y <= Y' + R
Y >= Y' - R
So to make a query on the database, you could first have the database eliminate all users whose X value doesn't satisfy those constraints, and this can be done using the index on the field. (The same goes for the Y co-ordinate.)
Another (rather more domain-specific) trick would be to split the world up into small squares that are indexable with a single identifier (could be a long, or even a string with the co-ordinates of the centre so long as you could re-create them reliably from any co-ordinate within the square). Then store which square each co-ordinate is in as well as the co-ordinate itself. If you are looking for e.g. a 5 mile radius, then make the squares at least 5 miles on a side. That way you can very quickly do a search on a small number of adjacent squares by identity (it would be no more than 9 in this case, the square containing the search point plus its 8 neighbours), then loop over the results in those squares to find the closest matches in your application.
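A minimal Python sketch of the grid idea follows. It makes some simplifying assumptions: co-ordinates are planar and measured in miles, the cell side (CELL_SIZE) is at least as large as the search radius so that the 3x3 block of cells around the search point is guaranteed to cover the circle, and the in-memory dictionary stands in for an indexed "cell" column in the database. All names are hypothetical.

```python
import math
from collections import defaultdict

CELL_SIZE = 5.0  # miles; must be >= the search radius for the 3x3 lookup to suffice


def cell_of(x, y):
    """Identifier of the grid square containing (x, y)."""
    return (math.floor(x / CELL_SIZE), math.floor(y / CELL_SIZE))


def build_index(points):
    """Map each cell identifier to the ids of the points stored in it."""
    index = defaultdict(list)
    for point_id, (x, y) in points.items():
        index[cell_of(x, y)].append(point_id)
    return index


def nearby(points, index, x, y, radius):
    """Ids of points within radius of (x, y), checking only the 9 adjacent cells."""
    if radius > CELL_SIZE:
        raise ValueError("radius must not exceed CELL_SIZE")
    cx, cy = cell_of(x, y)
    hits = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for point_id in index.get((cx + dx, cy + dy), []):
                px, py = points[point_id]
                # Refinement step: exact distance test on the few candidates.
                if math.hypot(px - x, py - y) <= radius:
                    hits.append(point_id)
    return sorted(hits)


points = {"a": (1.0, 1.0), "b": (4.0, 4.5), "c": (12.0, 0.0), "d": (-2.0, 2.0)}
index = build_index(points)
print(nearby(points, index, 2.0, 2.0, 5.0))  # ['a', 'b', 'd']
```

Point "c" is never even examined because its cell is not adjacent to the search cell, which is the "eliminate first, refine second" pattern in miniature.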
Most performance optimisations in this kind of thing are about eliminating data that certainly doesn't fit and then refining, rather than immediately going after data that certainly does.
PS - if you are using MySQL there is a GIS extension, which I haven't tried: http://dev.mysql.com/tech-resources/articles/4.1/gis-with-mysql.html. This probably does something like what I describe, and may or may not take into account the curvature of the earth, etc. However in most cases the successive refinement method is fairly safe, and means your database doesn't have to 'know' about GIS co-ordinate systems.
In my DB I store a center point, along with a radius (in meters).
I'm looking to pass in a lat/lng, and then have MySQL build a circle from the values I've stored and tell me whether the point I passed in is within that circle. Is there something that would allow me to do this, similar to the haversine formula (which would assume that my point was already in the db)?
Haversine Formula:
( 3959 * acos( cos( radians(40) ) * cos( radians( lat ) ) * cos( radians( long ) - radians(-110) ) + sin( radians(40) ) * sin( radians( lat ) ) ) )
db:
circleLatCenter, circleLngCenter, Radius
passing in>
select id from foo where lat,lng in (make circle function: circleLat, circleLng, radius)
MySQL has a whole host of spatial data functions:
Spatial Extensions to MySQL
I think the section on measuring the relationships between geometries is what you're after:
Relationships Between Geometries
I've done similar geographical searches by computing the bounding box via great circle distance and querying the database for that. You still need another pass in your application to "round the corners" from bounding box to circle.
So, given a database of points, a search point (X,Y) and a distance D, find all points within D of (X,Y):
Compute deltaX, which is the change in X that corresponds to moving distance D along the X axis.
Compute deltaY, which is the change in Y that corresponds to moving distance D along the Y axis.
Compute your bounding box: (X-deltaX,Y-deltaY),(X+deltaX,Y+deltaY)
Query the database of points using the SQL BETWEEN operator: SELECT * FROM TABLE WHERE X BETWEEN X-deltaX AND X+deltaX AND Y BETWEEN Y-deltaY AND Y+deltaY
Post-process the list of points returned, computing the actual great circle distance, to remove the points at the corners of the square that are not within your distance circle.
As a short-cut, I typically calculate degrees-per-mile for both lat and lon (at the equator, since the degrees-per-mile is different at the poles for lon), and derive the two deltas as D * degrees-lat-per-mile and D * degrees-lon-per-mile. The difference at the equator vs pole doesn't matter much, since I'm already computing actual distance after the SQL query.
FYI - 0.167469 to 0.014564 degrees-lon-per-mile, and 0.014483 degrees-lat-per-mile
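The steps above can be sketched in Python as follows. This is one possible reading of them, not the exact code behind this answer: it assumes X is latitude and Y is longitude, uses a 3959-mile Earth radius, and scales the longitude delta by cos(latitude) instead of using the equator short-cut, so it is not suitable right at the poles. The list comprehension in box_candidates stands in for the SQL BETWEEN query, and all function names are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3959.0


def bounding_box(lat, lon, distance_miles):
    """Return (min_lat, max_lat, min_lon, max_lon) in degrees."""
    delta_lat = math.degrees(distance_miles / EARTH_RADIUS_MILES)
    # A degree of longitude shrinks with cos(latitude), so widen the delta.
    delta_lon = delta_lat / math.cos(math.radians(lat))
    return lat - delta_lat, lat + delta_lat, lon - delta_lon, lon + delta_lon


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great circle distance in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def box_candidates(points, lat, lon, distance_miles):
    """Stand-in for the SQL BETWEEN query: keep points inside the bounding box."""
    min_lat, max_lat, min_lon, max_lon = bounding_box(lat, lon, distance_miles)
    return [
        (pid, plat, plon)
        for pid, plat, plon in points
        if min_lat <= plat <= max_lat and min_lon <= plon <= max_lon
    ]


def within_distance(points, lat, lon, distance_miles):
    """Post-process the candidates to "round the corners" of the box."""
    return [
        pid
        for pid, plat, plon in box_candidates(points, lat, lon, distance_miles)
        if haversine_miles(lat, lon, plat, plon) <= distance_miles
    ]


points = [
    ("near", 40.05, -110.05),    # a few miles away
    ("corner", 40.14, -110.18),  # inside the box, outside the circle
    ("far", 41.00, -110.00),     # outside the box entirely
]
print(within_distance(points, 40.0, -110.0, 10.0))  # ['near']
```

The "corner" point passes the box test but is about 13.6 miles away, so only the second pass removes it.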
I know this is a long-dead post, but, in case anyone ever comes across this, you don't need to create a "reverse haversine formula" at all. The Haversine formula gives the distance between point a and point b. You need the distance between point b and point a, for your calculation. These are the same value.
SELECT *,
( 6371000 /* Earth radius in meters, since Radius is stored in meters */ * acos( cos( radians(40) ) * cos( radians( `circleLatCenter` ) ) * cos( radians( `circleLngCenter` ) - radians(-110) ) + sin( radians(40) ) * sin( radians( `circleLatCenter` ) ) ) ) as `haversine`
FROM `table` WHERE 1=1
HAVING `haversine` < `Radius`