Generating random latitude and longitude in MySQL - mysql

I'm trying to generate some dummy records in MySQL and need to create random latitude and longitude float values within a given range.
For example I need to generate latitudes that are between 52.077090052913654 and 52.477040512464626
and longitudes between -1.8840792500000134 and -0.9172823750000134
I'm familiar with creating ranges of random numbers using FLOOR and RAND, but that only produces whole numbers. How might I do this?

You can use RAND() to get a random value between two values like so:
randVal = (RAND() * (maxVal - minVal)) + minVal
With your numbers:
SELECT
(RAND() * (52.477040512464626 - 52.077090052913654)) + 52.077090052913654 AS randlatitude,
(RAND() * (-0.9172823750000134 - -1.8840792500000134)) + -1.8840792500000134 AS randlongitude
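The same scaling can be sanity-checked outside the database. The following is a minimal Python sketch, not part of the original answer: `random_in_range` is a hypothetical helper name, and `random.random()` stands in for MySQL's RAND() (both return a value in [0, 1)).

```python
import random

def random_in_range(min_val, max_val):
    """Scale a uniform [0, 1) draw into [min_val, max_val), mirroring
    (RAND() * (maxVal - minVal)) + minVal from the SQL above."""
    return random.random() * (max_val - min_val) + min_val

# Bounds taken from the question.
LAT_MIN, LAT_MAX = 52.077090052913654, 52.477040512464626
LON_MIN, LON_MAX = -1.8840792500000134, -0.9172823750000134

if __name__ == "__main__":
    print(random_in_range(LAT_MIN, LAT_MAX), random_in_range(LON_MIN, LON_MAX))
```

Because the formula only multiplies and adds, it works the same way for negative bounds such as the longitudes here.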

Related

Optimising this very slow MySQL Query

I'm not particularly knowledgeable about MySQL queries and optimising them, so I need a bit of help with this one. I'm checking a table of international cities to find the 10 nearest cities based on the longitude and latitude values in the table.
The query I'm using for this is as follows:
SELECT City as city,
SQRT(POW(69.1 * (Latitude - 51.5073509), 2) +
POW(69.1 * (-0.1277583 - Longitude) * COS(Latitude / 57.3), 2)) AS distance
from `cities`
group by `City`
having distance < 50
order by `distance` asc
limit 10
(The longitude and latitude values are placed dynamically in my code.)
Sometimes this can take around 3-4 minutes to complete on my development environment.
Have I made any classic mistakes here, or is there a much better query I should be using to retrieve this data?
Any help would be greatly appreciated.
Assuming City is unique and you are abusing GROUP BY and HAVING only to get cleaner code, the query can be written as:
SELECT City as city,
SQRT(POW(69.1 * (Latitude - 51.5073509), 2) +
POW(69.1 * (-0.1277583 - Longitude) * COS(Latitude / 57.3), 2)) AS distance
from `cities`
where SQRT(POW(69.1 * (Latitude - 51.5073509), 2) +
POW(69.1 * (-0.1277583 - Longitude) * COS(Latitude / 57.3), 2)) < 50
order by `distance` asc
limit 10
If City is unique, then the aggregation is done on single rows.
MySQL uses a sort operation to implement GROUP BY.
Sort complexity is O(n*log(n)), so without indexes this is going to be the complexity of the GROUP BY.
If City is not unique, then the filtering in the HAVING clause is done on one arbitrary row per group, which is surely not what the OP intended.
The case where HAVING and WHERE are both relevant for filtering and HAVING has a performance advantage is where the filtering is done on the aggregated column, there are some heavy calculations, and the GROUP BY operation significantly reduces the number of rows:
select x,... from ... group by x having ... some heavy calculations on x ...

Methods for geographic distance search in MySQL

I'm looking for the fastest way to search for Points that are within a certain distance of another given Point. I have a MyISAM table with spatially indexed Points representing geographic locations (latitude, longitude).
If MySQL supported it, I think ST_DWithin would do the job. But it doesn't, so I came up with the following expression, which uses a buffer to generate a circle and then looks for points that fall within this circle:
ST_Within(geopoint, ST_Buffer(Point(@lat, @lng), @radius))
It seems to be working fine and I believe it uses the index. But is it a good enough solution? How precise is ST_Within and ST_Buffer for geography purposes?
UPDATE: I concluded that MySQL doesn't offer support for Geography coordinates and that all operations are done on a Euclidean plane (even if you specify the SRID). Depending on the location, that eventually leads to big imprecisions. So the coordinates need to be transformed prior to using MySQL Spatial functions.
We do something similar to this at work.
We get about 1 million queries per hour, and when we were using spatial indexes it would basically take the database down, with queries stuck in a pending state. Some queries were pending for about 8,000 seconds (about 2 hours). So we had to find another way, and this was the best we could come up with. It no longer backs up the database and returns results in milliseconds.
What we do is first we have a distance function which looks like this:
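CREATE FUNCTION `distance`(`lat1` DECIMAL(10,7), `lon1` DECIMAL(10,7), `lat2` DECIMAL(10,7), `lon2` DECIMAL(10,7)) RETURNS double
BEGIN
-- Great-circle distance in miles (spherical law of cosines).
DECLARE X DOUBLE;
DECLARE PI DECIMAL(21, 20);
SET PI = 3.14159265358979323846;
-- X is the cosine of the central angle between the two points.
SET X = SIN(lat1 * PI / 180)
* SIN(lat2 * PI / 180)
+ COS(lat1 * PI / 180)
* COS(lat2 * PI / 180)
* COS((lon2 * PI / 180) - (lon1 * PI / 180));
-- Recover the angle in radians. ATAN2 also handles X <= 0, where ATAN(... / X) breaks.
SET X = ATAN2(SQRT(1 - POWER(X, 2)), X);
-- 60 nautical miles per degree, 1.852 km per nautical mile, 1.609344 km per mile.
RETURN (1.852 * 60.0 * ((X / PI) * 180)) / 1.609344;
END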
Remove / 1.609344 on the return line to get kilometers
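To check the arithmetic of this function outside MySQL, here is a rough Python port. It is only a sketch under assumptions: `distance_miles` is a hypothetical name, the clamp on `x` is an extra guard against floating-point rounding, and `atan2` is used so that points more than 90 degrees apart still give a positive angle.

```python
import math

def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles via the spherical law of cosines."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    # Cosine of the central angle between the two points.
    x = (math.sin(phi1) * math.sin(phi2)
         + math.cos(phi1) * math.cos(phi2) * math.cos(dlon))
    # Guard against rounding pushing x just outside [-1, 1].
    x = max(-1.0, min(1.0, x))
    # atan2 recovers the angle for any sign of x, including x == 0.
    angle = math.atan2(math.sqrt(1.0 - x * x), x)
    # 60 nautical miles per degree, 1.852 km per nautical mile, 1.609344 km per mile.
    return 1.852 * 60.0 * math.degrees(angle) / 1.609344
```

One degree of longitude along the equator comes out at roughly 69 miles, which is a quick way to confirm the constants.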
We then have a procedure to calculate the distance between your location and the surrounding area. From what we tested this was the fastest (simplified version of what we have):
CREATE PROCEDURE `MyRadius`(IN `p_lat` DOUBLE, IN `p_long` DOUBLE, IN `radius` INT)
LANGUAGE SQL
NOT DETERMINISTIC
CONTAINS SQL
SQL SECURITY DEFINER
COMMENT ''
BEGIN
SELECT distance(p_lat, p_long, g.latitude, g.longitude) as distance, country, region, city
from geocity g
having distance <= radius
order by distance asc limit 100;
END
You may want to change the order clause, because I am not sure how you want to order it.

Get nearest location in database

I am trying to find the location in a database that is nearest to a user's input (the nearest store, based on latitude and longitude). I convert the user's postcode to latitude and longitude, and then need to search my database for the store nearest to those values. I have the latitude and longitude of all stores saved, and so far (from looking at previous questions) I have tried something like:
SELECT *
FROM mystore_table
WHERE `latitude` >=(51.5263472 * .9) AND `longitude` <=(-0.3830181 * 1.1)
ORDER BY abs(latitude - 51.5263472 AND longitude - -0.3830181) limit 1;
When I run this query it does return a result, but it is not the nearest store. I'm not sure whether it has something to do with the negative numbers. Both my latitude and longitude columns are stored as decimal data types.
You have a logical operation in the ORDER BY rather than an arithmetic one. Try this:
SELECT *
FROM mystore_table
WHERE `latitude` >=(51.5263472 * .9) AND `longitude` <=(-0.3830181 * 1.1)
ORDER BY abs(latitude - 51.5263472) + abs(longitude - -0.3830181)
limit 1;
The AND in your original version would be producing a boolean value, either 0 or 1, and it would only be 0 when one of the values matches exactly to the last decimal place. Not very interesting.
There are many reasons why this is not the nearest distance, but it might be close enough for your purposes. Here are some reasons:
Euclidean distance would take the square of the differences.
The distance covered by one degree of longitude depends on the latitude (varying from about 70 miles at the equator to 0 at the poles).
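The second point can be made concrete with a small Python sketch. This is an illustration rather than part of the answer: `miles_per_degree_longitude` is a hypothetical helper, and 69.17 miles per degree at the equator is an approximate figure for a spherical Earth.

```python
import math

# Approximate length of one degree of longitude at the equator, in miles.
EQUATOR_MILES_PER_DEGREE = 69.17

def miles_per_degree_longitude(latitude_deg):
    """Length of one degree of longitude at the given latitude.

    It shrinks with the cosine of the latitude, which is why adding raw
    latitude and longitude differences overweights longitude away from
    the equator.
    """
    return EQUATOR_MILES_PER_DEGREE * math.cos(math.radians(latitude_deg))
```

At the latitude in the question (about 51.5 degrees) a degree of longitude is roughly 43 miles, so a longitude difference should count for noticeably less than the same latitude difference.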

Using WHERE clause to find POI within a range of distance from Longitude and Latitude

I'm using the following SQL code to find all POIs closest to the set coordinates, but I want to find specific POIs instead of all of them. When I try to use a WHERE clause I get an error, and this is where I'm currently stuck, since I only use one table for the coordinates of all POIs.
SET @orig_lat=55.4058;
SET @orig_lon=13.7907;
SET @dist=10;
SELECT
*,
3956 * 2 * ASIN(SQRT(POWER(SIN((@orig_lat -abs(latitude)) * pi()/180 / 2), 2)
+ COS(@orig_lat * pi()/180 ) * COS(abs(latitude) * pi()/180)
* POWER(SIN((@orig_lon - longitude) * pi()/180 / 2), 2) )) as distance
FROM geo_kulplex.sweden_bobo
HAVING distance < @dist
ORDER BY distance limit 10;
The problem is that you cannot reference an aliased column (distance in this case) in a select or where clause. For example, you can't do this:
select a, b, a + b as NewCol, NewCol + 1 as AnotherCol from table
where NewCol = 2
This will fail in both places: in the select list when trying to process NewCol + 1, and in the where clause when trying to process NewCol = 2.
There are two ways to solve this:
1) Replace the reference by the calculated value itself. Example:
select a, b, a + b as NewCol, a + b + 1 as AnotherCol from table
where a + b = 2
2) Use an outer select statement:
select a, b, NewCol, NewCol + 1 as AnotherCol from (
select a, b, a + b as NewCol from table
) as S
where NewCol = 2
Now, given your HUGE and not very human-friendly calculated column :) I think you should go for the last option to improve readability:
SET @orig_lat=55.4058;
SET @orig_lon=13.7907;
SET @dist=10;
SELECT * FROM (
SELECT
*,
3956 * 2 * ASIN(SQRT(POWER(SIN((@orig_lat -abs(latitude)) * pi()/180 / 2), 2)
+ COS(@orig_lat * pi()/180 ) * COS(abs(latitude) * pi()/180)
* POWER(SIN((@orig_lon - longitude) * pi()/180 / 2), 2) )) as distance
FROM geo_kulplex.sweden_bobo
) AS S
WHERE distance < @dist
ORDER BY distance limit 10;
Edit: As @Kaii mentioned below, this will result in a full table scan. Depending on the amount of data you will be processing, you might want to avoid that and go for the first option, which should perform faster.
The reason you can't use your alias in the WHERE clause is the order in which MySQL evaluates the clauses:
FROM
WHERE
GROUP BY
HAVING
SELECT
ORDER BY
When executing your WHERE clause, the value for your column alias is not yet calculated. This is a good thing, because calculating it first would waste a lot of performance. Imagine many (1,000,000) rows: to use your calculation in the WHERE clause, each of those 1,000,000 rows would first have to be fetched and calculated so the WHERE condition can compare the result to your expectation.
You can do this explicitly in one of three ways:
using HAVING (that's the reason HAVING has a different name from WHERE: it is a different thing)
using a subquery as illustrated by @MostyMostacho (effectively does the same, with some overhead)
putting the complex calculation in the WHERE clause (effectively gives the same performance as HAVING)
All of those will perform almost equally badly: each row is fetched first, the distance is calculated, and finally the rows are filtered by distance before the result is sent to the client.
You can gain much (!) better performance by combining a simple WHERE clause for distance approximation (filtering which rows to fetch first) with the more precise Haversine formula in a HAVING clause:
Find rows that could match the @dist = 10 condition using a WHERE clause based on simple X and Y distance (a bounding box). This is a cheap operation.
Filter those results using the Haversine formula in a HAVING clause. This is an expensive operation.
Look at this query to understand what I mean:
SET @orig_lat=55.4058;
SET @orig_lon=13.7907;
SET @dist=10;
SELECT
*,
3956 * 2 * ASIN(SQRT(POWER(SIN((@orig_lat -abs(latitude)) * pi()/180 / 2), 2)
+ COS(@orig_lat * pi()/180 ) * COS(abs(latitude) * pi()/180)
* POWER(SIN((@orig_lon - longitude) * pi()/180 / 2), 2) )) as distance
FROM geo_kulplex.sweden_bobo
/* WHERE clause to pre-filter by distance approximation; the results are filtered
later with the precise Haversine calculation. Can use indexes. */
WHERE
/* I'm unsure about geo stuff ... I don't think you want a
distance of 10° here, please adjust this properly!! */
latitude BETWEEN (@orig_lat - @dist) AND (@orig_lat + @dist)
AND longitude BETWEEN (@orig_lon - @dist) AND (@orig_lon + @dist)
/* HAVING clause to filter the result using the more precise Haversine distance */
HAVING distance < @dist
ORDER BY distance limit 10;
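The comment in the query above flags that the distance is in miles while the BETWEEN bounds are in degrees. One way to adjust this is to convert the radius into degree offsets before building the bounding box. The Python below is a sketch under assumptions: `bounding_box` is a hypothetical helper, and 69.0 miles per degree of latitude is an approximation.

```python
import math

# Approximate length of one degree of latitude, in miles (assumed constant).
MILES_PER_DEGREE_LAT = 69.0

def bounding_box(lat, lon, dist_miles):
    """Return (lat_min, lat_max, lon_min, lon_max) enclosing a circle of
    dist_miles around (lat, lon). Not valid near the poles or across the
    180 degree meridian."""
    lat_delta = dist_miles / MILES_PER_DEGREE_LAT
    # A degree of longitude shrinks with cos(latitude), so the box must be
    # wider in degrees the further it is from the equator.
    lon_delta = dist_miles / (MILES_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return (lat - lat_delta, lat + lat_delta, lon - lon_delta, lon + lon_delta)
```

The four values could then be used as the BETWEEN bounds in the WHERE clause, leaving the exact distance check in HAVING unchanged.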
For those who are interested in the constants:
3956 is approximately the radius of the earth in miles, so the resulting distance is measured in miles
6371 is the radius of the earth in kilometers, so use this constant to measure distance in kilometers
Find more information in the Wikipedia article about the Haversine formula.
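For reference, the Haversine expression used in the queries can be written as a small Python function. This is a sketch for checking numbers by hand, not the answerer's code: `haversine` is a hypothetical name, and the `radius` parameter selects the unit as described above.

```python
import math

def haversine(lat1, lon1, lat2, lon2, radius=3956.0):
    """Great-circle distance between two points given in degrees.

    radius=3956.0 gives miles, radius=6371.0 gives kilometers.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Haversine of the central angle.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to guard against rounding slightly above 1 for antipodal points.
    return radius * 2 * math.asin(math.sqrt(min(1.0, a)))
```

Swapping the radius is the only change needed to switch between miles and kilometers.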

MySQL: Guarantee Results, Decimal Issue

I have a table with the following attributes:
MyTable:
- double longitude
- double latitude
- varchar place_id
- varchar geoJSON_string
Given some point having a longitude x and latitude y I need to select the k closest points.
I know I can tack a LIMIT k on the end of the query, but is there a way I can guarantee at least k points from a database of ~250,000 records?
Also, how would I even query the decimal values? I need to select something similar to the following:
SELECT * FROM MyTable WHERE latitude=140.3**** and longitude=132.2**** LIMIT k;
SELECT * FROM MyTable WHERE latitude BETWEEN '140.3' AND '140.4' AND longitude BETWEEN '132.2' AND '132.3' LIMIT k;
Or, if you need the closest matches (searched_lon and searched_lat stand for the coordinates you are searching for):
SELECT * FROM MyTable ORDER BY ABS(searched_lon - longitude) + ABS(searched_lat - latitude) ASC LIMIT k;