MySQL select rows in Geolocation range - mysql

Background information: I am making an app (a school project) where users can search for items nearby. The app uses an Items table where all item data is stored.
I need to get the rows that lie within a certain distance (5 km) of the user's current position (51.337036, 4.645095). The Items table has latitude and longitude columns that locate each item.
My current query selects the rows inside a square (four coordinate points), but I need a circle instead:
WHERE (C.latitude BETWEEN 50 AND 52) AND (C.longitude BETWEEN 3 AND 5)
Is it possible to use a MySQL Geolocation variable or function?
After some research I saw a type POINT, but my table doesn't have that type (the table needs to stay the same).

As said in the comments, you need a valid point of origin:
WHERE st_distance_sphere(POINT(-82.337036, 29.645095 ), POINT(C.`longitude`, C.`latitude` ))/1000 <= 2 AND T.difficulty = 1 AND T.terrain = 2;
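This returns all rows that are at most 2 km from POINT(-82.337036, 29.645095). ST_Distance_Sphere() returns metres, hence the division by 1000, and POINT() takes the longitude first, then the latitude.
Then you have to put in your own location and radius, e.g. POINT(4.645095, 51.337036) and 5.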
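For checking the query's results outside the database, here is a minimal Python sketch of the same great-circle filter. It assumes a mean Earth radius of 6371 km (MySQL may use a slightly different default), and the function names and sample rows are hypothetical, not taken from the question.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: commonly used mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against tiny floating-point overshoot before asin
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def items_within(items, lat, lon, radius_km):
    """Keep only the rows whose distance to (lat, lon) is at most radius_km."""
    return [
        item for item in items
        if haversine_km(lat, lon, item["latitude"], item["longitude"]) <= radius_km
    ]


# Hypothetical sample rows standing in for the Items table
items = [
    {"id": 1, "latitude": 51.340000, "longitude": 4.650000},  # well under 1 km away
    {"id": 2, "latitude": 51.500000, "longitude": 4.645095},  # roughly 18 km north
]
nearby = items_within(items, 51.337036, 4.645095, 5.0)
```

With these sample rows only the first item falls inside the 5 km radius.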

Related

Finding drivers from nearest locations of a given location

I have 3 tables.
Table 1 - drivers
Table 2 - location
Table 3 - distance
The user will search for a driver that matches a location. In the drivers table, the location refers to the current location of the driver. If no driver is available in a particular location, I want a query that finds the driver closest to the location that the user has provided.
Table 2 holds the location names and Table 3 the distance from one place to another. The problem is that if the distance from locationid 1 to locationid 2 is stored, the reverse (locationid 2 to locationid 1) is not.
This is not straightforward.
First you need to get all the locations with which the given location has a relationship. This result includes the locationid and the distance.
The distance from the location to itself is zero. I've used UNION ALL to build a list of <locationid,distance> pairs.
Then make an INNER JOIN between the above list and your drivers table on matching location.
And finally sort the result set based on distance in ascending order.
SELECT
    *
FROM drivers DR
INNER JOIN
(
    SELECT
        locationid,
        0 AS distance
    FROM location
    WHERE locationname = 'Gulshan'
    UNION ALL
    SELECT
        IF(L.locationid = D.fromid, D.toid, D.fromid),
        D.distance
    FROM location L
    INNER JOIN distance D ON L.locationid IN (D.fromid,D.toid)
    WHERE locationname = 'Gulshan'
) AS t
ON DR.location = t.locationid
ORDER BY t.distance
Note: You can add LIMIT n to restrict the result set to at most the top n search results.
You can also add a WHERE t.distance < MAX_ALLOWABLE_DISTANCE condition to the query so that drivers who are too far away are excluded.
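The same idea, treating each stored distance as valid in both directions, can be sketched in Python. This is only an illustration of the logic under assumed in-memory data; the function name, the tuple layout and the driver names are hypothetical.

```python
def nearest_drivers(drivers, distances, origin_id):
    """Return (driver_name, distance) pairs sorted by distance from origin_id.

    drivers:   list of (driver_name, location_id)
    distances: list of (from_id, to_id, distance), each pair stored one way only
    """
    # The origin itself is at distance 0, like the first SELECT of the UNION ALL
    reach = {origin_id: 0}
    for from_id, to_id, dist in distances:
        # Read each row in both directions, like IF(L.locationid = D.fromid, ...)
        if from_id == origin_id:
            reach[to_id] = dist
        elif to_id == origin_id:
            reach[from_id] = dist
    matches = [(reach[loc], name) for name, loc in drivers if loc in reach]
    return [(name, dist) for dist, name in sorted(matches)]


# Hypothetical sample data
distances = [(1, 2, 5), (3, 1, 2), (2, 3, 4)]
drivers = [("Asha", 2), ("Babul", 3), ("Chitra", 1), ("Dipu", 4)]
result = nearest_drivers(drivers, distances, origin_id=1)
```

Here the driver at location 4 is left out because no distance between locations 1 and 4 is stored, which mirrors the behaviour of the INNER JOIN.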

Max latitude for x distance from longitude - Max longitude for x distance from latitude - SQL

Right now I have a table with 100 million rows:
CREATE TABLE o (
id int UNIQUE,
latitude FLOAT(10, 8),
longitude FLOAT(11, 8)
);
On my back end I am receiving a user lat/long and trying to return everything within x distance of that.
Instead of doing the distance formula on every single result I was thinking I could possibly calculate the maximum lat/long for X distance.
So we are sort of creating a square by finding the max lat/min lat, max long/min long.
Once we have these max values we would do the query on this range of values thus making our subset significantly smaller to then do the actual distance formula on (i.e., finding the values within X distance).
So my question to you is:
Which option runs faster?
Option 1)
Distance formula on 100 million entries to get the set.
Option 2)
Instead of doing the distance formula on the set of 100 million entries we calculate the min/max lat/long.
Select the values in that range from the table of 100 million entries
Do the distance formula on our new smaller set.
Option 3)
Something exists already for this in SQL
If option 2 is faster the next issue is actually solving that math problem.
If you want to look at that continue reading:
Lat/Long distance formula
dlon = lon2 - lon1
dlat = lat2 - lat1
a = (sin(dlat/2))^2 + cos(lat1) * cos(lat2) * (sin(dlon/2))^2
c = 2 * atan2(sqrt(a), sqrt(1-a))
d = R * c
Obviously we can rearrange this, because d (assume 1 mile) and R (the radius of the earth) are fixed values, so we get d/R = c.
The problem then becomes how to solve c/2 = atan2(sqrt(a), sqrt(1-a)) for a.
1 -- 100M rows is a lot to scan and test. It's OK to do once in a while, but it is too slow to do often.
2 -- Using a pseudo-square bounding box and doing
WHERE latitude BETWEEN ...
AND longitude BETWEEN ...
is a good first step. The latitude range is a simple constant times X; the longitude range also divides by cos(latitude).
But the problem comes when you try to find just those rows in the square. Any combination of index on latitude and/or longitude, either separately or together, will only partially filter. That is, it will ignore longitude and give you everything within the latitude range, or vice versa. That might get you down to 100,000 rows to check the distance against. That's a lot better than 100,000,000, but not as good as you would hope for.
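The bounding-box arithmetic described in point 2 can be sketched as follows. This is a rough sketch, assuming a mean Earth radius of 3958.8 miles and a search point away from the poles and the 180° meridian; the function name is hypothetical. Since the longitude widening is an approximation, padding the box slightly before the exact distance check is a reasonable precaution.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # assumption: mean Earth radius in miles


def bounding_box(lat, lon, distance_miles):
    """Return (min_lat, max_lat, min_lon, max_lon) in degrees around (lat, lon).

    Approximation only: does not handle the poles or the antimeridian.
    """
    # Latitude: a constant number of degrees per mile, anywhere on the globe
    dlat = math.degrees(distance_miles / EARTH_RADIUS_MILES)
    # Longitude: meridians converge, so divide by cos(latitude) to widen the range
    dlon = dlat / math.cos(math.radians(lat))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


min_lat, max_lat, min_lon, max_lon = bounding_box(60.0, 10.0, 1.0)
```

The four values plug straight into the two BETWEEN conditions; the exact great-circle distance is then computed only for the rows that survive that filter.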
3 -- http://mysql.rjweb.org/doc.php/latlng Does get down to the square, or very close. It is designed to scale. I have tested only 3M rows, not 100M, but it should work fine.
The main trick is to partition on latitude, then have longitude be the first column in the PRIMARY KEY so that InnoDB will cluster the nearby rows nearby in the partition(s). If you look for all rows within X miles (or km) it might look at (and compute the great-circle-distance) for about twice as many rows as necessary, not 100K. If you want to find the nearest 100 items, it might touch about 400 (4x).
As for a SPATIAL index, you might want to upgrade to MySQL 5.7.6, which is when ST_Distance_Sphere() and ST_MakeEnvelope() were added. (MakeEnvelope is only marginally more convenient than building a Polygon yourself -- it has flat-earth syndrome.)

Lat/Long Distance Comparison

I am having some trouble with figuring out how to do this. What I have is a list of 160K locations on an Access table with lat and long coordinates for each. I am trying to find out how to create a column that compares 1 item on the list to the rest of the items to bring back the closest distance in miles.
I've figured out how to use the haversine formula to make a 1 to 1 comparison but I am lost in trying to automate the rest.
This is basically what I want to try to produce...
Loc_ID  Loc_Lat    Loc_Long    Min_Miles_Away
1       33.537214  -81.687378  674.48
4       42.16584   -87.845117  11.83
5       41.99558   -87.869057  11.83
6       41.85325   -89.486883  83.75
Explanation to the table...
Location 1 is closest to location 5 (674.48 miles apart)
Location 4 is closest to location 5 (11.83 miles apart)
Location 5 is closest to location 4 (11.83 miles apart)
Location 6 is closest to location 5 (83.75 miles apart)
Any help would be appreciated.
You can do a Cartesian join, i.e. a join that pairs each row with every other row. You can do that by simply writing the SQL into the SQL view of the query. Exclude the pairing of a row with itself, otherwise the minimum distance will always be 0.
SELECT *
FROM locations a, locations b
WHERE a.loc_id <> b.loc_id
Next you can calculate the distance (I guess you have that code already, so just insert the function) on that table.
Finally you can group by location and take the MIN.
SELECT loc_id, loc_lat, loc_long, MIN(calculated_distance) AS min_miles_away
FROM myCalculatedQuery
GROUP BY loc_id, loc_lat, loc_long
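To sanity-check the query output on a small sample, the same pair-everything-then-take-the-minimum logic can be sketched in Python. It assumes the haversine formula with a mean Earth radius of 3958.8 miles; the function names are hypothetical and the coordinates are three rows from the question's table. Like the Cartesian join, it compares every pair, so it is only practical on small samples, not on all 160K rows.

```python
import math


def haversine_miles(lat1, lon1, lat2, lon2, radius_miles=3958.8):
    """Great-circle distance in miles (radius is an assumed mean Earth radius)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_miles * math.asin(min(1.0, math.sqrt(a)))


def nearest_distances(locations):
    """locations: dict of loc_id -> (lat, lon). Returns loc_id -> miles to nearest other location."""
    result = {}
    for a_id, (a_lat, a_lon) in locations.items():
        result[a_id] = min(
            haversine_miles(a_lat, a_lon, b_lat, b_lon)
            for b_id, (b_lat, b_lon) in locations.items()
            if b_id != a_id  # skip the row itself, otherwise the minimum is 0
        )
    return result


locations = {
    4: (42.16584, -87.845117),
    5: (41.99558, -87.869057),
    6: (41.85325, -89.486883),
}
min_miles_away = nearest_distances(locations)
```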

Calculate amount of steps needed to navigate through a crawled website from one URL to another

Situation:
A single domain is crawled completely (Up to 10.000.000 URLs) and all URLs are saved into a MySql database table. Every URL is given a unique ID. All the Links between the URLs are saved in another Table. For example The URL with the ID 1 links to the URL with the ID 893. One URL can link to n others, backlinks and loops are possible (URL 1 Linking to URL 6. URL 6 Linking to URL 3 and URL 3 Linking back to URL 1). Because of the crawling nature every URL must have a path to the root URL.
My goal is to calculate the amount of steps required to get from the root level to a given URL. In the end I want to provide the information to the user that URL 89 is 12 links away from the root level (shortest path found).
This problem has probably been solved before, so is there a paper or even an example on how to solve this without brute-forcing it?
Algorithm:
Set root url distance to zero and others to null
Begin Loop
Find urls matching the current distance
Find their linked urls and if they are null set their distance to current + 1
Increment current distance
Loop if there are urls with distance not set yet
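The loop above is essentially a breadth-first search over the link graph. A minimal in-memory Python sketch follows, assuming the links fit in memory as (source_id, target_id) pairs; the function name and sample links are hypothetical.

```python
from collections import deque


def link_distances(links, root):
    """Return a dict of url_id -> fewest clicks needed to reach it from root.

    links: iterable of (source_id, target_id) pairs
    URLs that cannot be reached from root are simply absent from the result.
    """
    outgoing = {}
    for src, dst in links:
        outgoing.setdefault(src, []).append(dst)

    distance = {root: 0}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for nxt in outgoing.get(current, []):
            if nxt not in distance:  # first visit is the shortest path in BFS
                distance[nxt] = distance[current] + 1
                queue.append(nxt)
    return distance


# Hypothetical links, including the loop 1 -> 6 -> 3 -> 1 from the question
links = [(1, 6), (6, 3), (3, 1), (1, 893), (893, 3)]
clicks = link_distances(links, root=1)
```

Because each URL is queued at most once, loops and backlinks do not cause repeated work, and the search ends even when some URLs are unreachable.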
Have tested it with your data (942 urls, 27008 links) and got following results:
Shortest clicks from the start page:
Distance Count
0 1
1 149
2 600
3 141
4 38
5 7
6 6
Shortest clicks from/back to start page (uncomment the 3 lines UNION and SELECT):
Distance Count
0 1
1 494
2 447
I have put it on sql fiddle with a small amount of my own test data (had to use SQL Server as it only allows Select queries for MySQL).
http://sqlfiddle.com/#!6/efdd1/4
UPDATE crawl_urls SET Distance = NULL    -- Reset distances for the test
UPDATE crawl_urls SET Distance = 0       -- Start Root Url at 0 distance
WHERE ID = (SELECT MIN(ID) FROM crawl_urls)
DECLARE @UrlsToDo int = -1               -- Count of Urls still to process
DECLARE @Distance int = 0                -- Current Distance from root
WHILE (@UrlsToDo != 0)                   -- Loop while urls to process
BEGIN
    UPDATE crawl_urls                    -- Find urls at current distance
    SET Distance = @Distance + 1         -- Set their linked urls distance
    WHERE Distance IS NULL AND ID IN (
        SELECT target_urls_id IDs FROM Links L1
        INNER JOIN crawl_urls A ON L1.crawl_urls_id = A.ID AND A.Distance = @Distance
        --UNION ALL                      -- Union of both sides of link
        --    SELECT crawl_urls_id IDs FROM Links L2  -- Uncomment for shortest way BACK
        --    INNER JOIN crawl_urls B ON L2.target_urls_id = B.ID AND B.Distance = @Distance
    )
    SET @UrlsToDo = (SELECT COUNT(ID) FROM crawl_urls WHERE Distance IS NULL)
    SET @Distance = @Distance + 1
END                                      -- Increment Distance and loop
SELECT * FROM crawl_urls ORDER BY Distance  -- Output results
Things to note: You will need to make sure the root URL distance is 0 at the start. Also be aware that the loop could run indefinitely if there is an orphan URL that no link connects to the rest, because its distance would never be set. This should not be possible in theory unless there were errors while crawling and records were skipped. Proper indexing will make a huge difference with bigger data sets.
I will be doing something almost identical to this soon and here are some other things I have noticed. There were 5% duplicates in the Links table and only allowing uniques would greatly speed things up - less records and better indexes. Also the home page has been added twice (with and without the '/' at end) so there will be a lot of extra duplicate links there in both directions, this may also apply to search friendly urls and folder names.

select places with nearly same location (duplicates) by latitude/longitude

Let's say I have a table venues with the following columns:
id
user_id
name
latitude
longitude
The latitude and longitude are kept as FLOAT(10,6) values. As different users add venues, there are duplicate venues. How can I select all the duplicates from the table within a range of, let's say, 50 metres? (This might be hard to achieve, since the metre equivalent of a degree of longitude differs at different latitudes, so an approximation is absolutely fine.) The query should select all the venues involved: VenueA and VenueB (there might be VenueC, VenueD, etc.) so that I can compare them. It should filter out venues that are the only one at their location within the range (I care only about duplicates).
I was looking for an answer but had to settle with answering myself.
SELECT s1.id, s1.name, s2.id, s2.name FROM venues s1, venues s2
WHERE s2.id > s1.id AND
(POW(s1.latitude - s2.latitude, 2) + POW(s1.longitude - s2.longitude, 2) < 0.001)
The first condition selects only half of the matrix, as the order of similar venues is not important. The second one is a simplified distance calculation. As user185631 suggested, the haversine formula should do the trick if you need more precision, but I didn't need it: I was looking for duplicates with the same coordinates, and could not rely on s1.latitude = s2.latitude AND s1.longitude = s2.longitude due to float/decimal corruption in my DB.
Of course, checking this at insert time would be better, but if you end up with a corrupt DB you need to clean it somehow. Please also note that this query is heavy on the server if your tables are big.
Create a function which computes distances between lat/lons. For small/less accurate distance (which is the case here) you can use the Equirectangular approximation (see section here: http://www.movable-type.co.uk/scripts/latlong.html). If the distance is less than your chosen threshold (50m), then it is a duplicate.
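A minimal Python sketch of that suggestion follows, assuming a mean Earth radius of 6,371,000 m; the function names and the sample venues are hypothetical. For comparison, a squared-degree threshold of 0.001 corresponds to roughly 0.03 degrees, which is a few kilometres of latitude rather than 50 m, so a threshold in metres is easier to reason about.

```python
import math


def equirectangular_m(lat1, lon1, lat2, lon2, radius_m=6_371_000.0):
    """Approximate distance in metres; adequate for short distances only."""
    # Scale the longitude difference by cos(mean latitude) to account for convergence
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return radius_m * math.hypot(x, y)


def duplicate_pairs(venues, threshold_m=50.0):
    """venues: list of (id, name, lat, lon). Returns (id1, id2) pairs closer than threshold_m."""
    pairs = []
    for i, (id1, _, lat1, lon1) in enumerate(venues):
        # Compare only with later rows, i.e. half of the matrix
        for id2, _, lat2, lon2 in venues[i + 1:]:
            if equirectangular_m(lat1, lon1, lat2, lon2) < threshold_m:
                pairs.append((id1, id2))
    return pairs


# Hypothetical sample venues
venues = [
    (1, "Cafe A", 52.000000, 13.000000),
    (2, "Cafe A (dup)", 52.000200, 13.000100),  # a few tens of metres from venue 1
    (3, "Bar B", 52.010000, 13.000000),         # over a kilometre away
]
dupes = duplicate_pairs(venues)
```

Like the self-join in SQL, this compares every pair, so on large tables it is worth narrowing the candidates first, for example with a latitude/longitude range filter.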
Determine what 50 meters is in terms of lat and long. Then plus and minus that to your starting location to come up with a max and min for both lat and long. Then...
SELECT id FROM venues WHERE latitude < (your max latitude) AND latitude > (your min latitude) AND longitude < (your max longitude) AND longitude > (your min longitude);
Converting meters to lat/long is very tricky as it depends on where the starting point is on the globe. See the middle section of the page here: http://www.uwgb.edu/dutchs/usefuldata/utmformulas.htm