A custom MySQL function to calculate the Haversine distance?

I'm building a 'find my nearest' script whereby my client has provided me with a list of their locations. After some research, I determined that the way to do this was to geocode the address/postcode given by the user, and use the Haversine formula to calculate the distance.
Formula wise, I got the answer I was looking for from this question (kudos to you guys). So I won't repeat the lengthy query/formula here.
What I'd like to be able to do, though, is something like:
SELECT address, haversine(#myLat,#myLong,db_lat,db_long,'MILES') .....
This would be just easier to remember, easier to read later, and more re-usable by copying the function into future projects without having to relearn / re-integrate the big formula. Additionally, the last argument could help with being able to return distances in different units.
Is it possible to create a user-defined MySQL function / procedure to do this, and how would I go about it? (I assume this is what they are for, but I've never needed to use them!)
Would it offer any speed difference (either way) over the long version?

Yes, you can create a stored function for this purpose. Something like this:
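DELIMITER //
DROP FUNCTION IF EXISTS Haversine //
CREATE FUNCTION Haversine
( myLat FLOAT
, myLong FLOAT
, db_lat FLOAT
, db_long FLOAT
, unit VARCHAR(20)
)
RETURNS FLOAT
DETERMINISTIC
BEGIN
  -- mean Earth radius in km; assumption: any unit other than 'MILES' falls back to km
  DECLARE radius DOUBLE DEFAULT 6371.0 ;
  DECLARE haver FLOAT ;
  IF unit = 'MILES' THEN
    SET radius = 3958.8 ; -- mean Earth radius in statute miles
  END IF ;
  -- Haversine formula; latitudes and longitudes are expected in degrees
  SET haver = 2 * radius * ASIN(SQRT(
      POW(SIN(RADIANS(db_lat - myLat) / 2), 2)
    + COS(RADIANS(myLat)) * COS(RADIANS(db_lat))
    * POW(SIN(RADIANS(db_long - myLong) / 2), 2))) ;
  RETURN haver ;
END //
DELIMITER ;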
I don't think it offers any speed gains but it's good for all the other reasons you mention: Readability, reusability, ease of maintenance (imagine you find an error after 2 years and you have to edit the code in a (few) hundred places).
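To sanity-check whatever formula ends up in the function body, it can help to have a small reference implementation outside the database. The following is a minimal Python sketch, not the answerer's code: the function name, the unit table, and the radius values (6371.0 km and 3958.8 miles, both mean-radius approximations) are assumptions made for illustration.

```python
import math

# Mean Earth radius per unit; both values are assumptions for this sketch.
EARTH_RADIUS = {"KM": 6371.0, "MILES": 3958.8}

def haversine(lat1, lon1, lat2, lon2, unit="KM"):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # min() guards against rounding error pushing sqrt(a) just above 1
    return 2 * EARTH_RADIUS[unit] * math.asin(min(1.0, math.sqrt(a)))

# A quarter of the equator: from (0, 0) to (0, 90)
print(haversine(0.0, 0.0, 0.0, 90.0, "KM"))
```

Running the same coordinates through the stored function and through this sketch should give values that agree up to FLOAT precision, provided both use the same radius.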

Related

(SICP) What is the difference between functions and procedures?

I am currently working my way through Structure and Interpretation of Computer Programs, doing both the book and the lectures from Brian Harvey (who is hilarious at points); however, I have yet to truly have my "aha! moment" with differentiating functions and procedures.
Now I have done my research outside of the lectures and readings and came across a few different posts regarding this same question, but all seem to branch out into separate discussions/opinions on the true interpretation or outdated definitions. A general answer I have seen is that functions return a value and procedures do not, however that does not clear up much for me and most users on that response seemed to have some arguments against that answer.
Diving into higher order procedures within the text and lectures I completely understand the concept and the power this makes available, however I am confused because I will hear "Higher Order Procedures" and "Higher Order Functions". Brian Harvey also mentioned that "A higher order procedure represents a higher order function".
I understand that the two functions below are the same function, but different procedures.
f(x) = 2x + 6
g(x) = 2(x + 3)
Now below, make-adder is referred to as a procedure with num as its formal parameter. The domain of make-adder is numbers, the range is procedures. I guess what is really stumping me is that he refers to the lambda expression as exactly that, a lambda expression, but make-adder is returning a procedure?
(define (make-adder num)
  (lambda (x) (+ x num)))
(define plus3 (make-adder 3))
(plus3 8) ; => 11
I thought I had a clear understanding until a few references to procedures during the higher order procedures lessons and that has since fogged things up.
Any help differentiating the two with a possible example? Thank you!

TL;DR: In the context of SICP, "procedure" and "function" mean the same thing.
In mathematics, a function is something you apply to arguments to get a value, and it always returns the same value for the same arguments. You could replace it with a map from arguments to results.
In programming languages like Scheme or JavaScript, the word "function" is not strictly correct for code that has side effects or whose return value isn't determined by its arguments alone.
Procedure is the more generic term: a procedure is not required to be referentially transparent, so it need not behave like a mathematical function, and in that sense both Scheme and JavaScript have procedures rather than functions. For example, a subroutine at the x86 machine level is a procedure: the call mechanism itself passes no arguments and returns no value, it just jumps and returns. C adds code that manipulates the stack to pass arguments and get a return value, so in that sense you can emulate a function. But C does not rule out a return value that differs between calls with the same input, so you can write a "C function" that is not a function in the mathematical sense, though you can still call it a procedure.
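The distinction can be made concrete with a short sketch. It is written in Python rather than Scheme purely for illustration, and every name in it (f, g, make_adder, next_ticket) is hypothetical: f and g are two different procedures computing the same mathematical function, make_adder is a higher-order procedure that returns a procedure, and next_ticket is a procedure that is not a function at all because its result changes between calls.

```python
import itertools

def f(x):
    # one procedure: multiply, then add
    return 2 * x + 6

def g(x):
    # a different procedure: add, then multiply; same mathematical function as f
    return 2 * (x + 3)

def make_adder(num):
    # higher-order procedure: its value is itself a procedure
    return lambda x: x + num

plus3 = make_adder(3)

_counter = itertools.count(1)

def next_ticket():
    # same (empty) argument list every time, different result on each call
    return next(_counter)

print(all(f(x) == g(x) for x in range(-5, 6)))
print(plus3(8))
print(next_ticket(), next_ticket())
```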

Is there a similar function like mt_rand for MYSQL

Is there a similar function like mt_rand for MYSQL? I've searched everywhere but can't seem to find one.

DELIMITER $$
CREATE FUNCTION `my_rand`(`arg_min` INT, `arg_max` INT) RETURNS int(11)
BEGIN
RETURN ROUND( arg_min + RAND( ) * (arg_max-arg_min) );
END $$
DELIMITER ;
Invocation:
SELECT my_rand(10,15);
This will output one integer between 10 and 15.
As a side note: if you wanted the benefits of mt_rand, meaning the 'Mersenne Twister', then I fear you should look for a UDF implementation (maybe you should check this Statistics for mySQL).
If instead you look for a function that simply spits out a random integer between min and max, this should do the trick.
Also, if we had used FLOOR as the rounding function, it would have been impossible to get 15 as a result in my example.
This is because, from the docs:
RAND() Returns a random floating-point value v in the range 0 <= v < 1.0.
Reading immediately after RAND, you can find ROUND in the docs. It states that ROUND has pretty implementation-dependent behaviour for rounding, so its behaviour may change (and has changed) between MySQL versions. This is probably the reason why the docs suggest rounding RAND with FLOOR instead of ROUND. Note also that with ROUND the two endpoints are each only about half as likely as the values in between. So, basically it seems debatable which choice is best... "FLOOR or ROUND, make your choice..."
Finally... take a look at this article. It may seem scary but at least it makes it clear that the question was 'difficult', indeed.
Order by RAND() by Jan Kneschke
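On the FLOOR versus ROUND question, a quick simulation shows how the two choices distribute their results. This is a Python sketch, not MySQL: rand_round approximates ROUND(arg_min + RAND() * (arg_max - arg_min)), and rand_floor is an assumed alternative that widens the range by one so that the upper bound stays reachable. The function names, the seed, and the sample size are all illustrative.

```python
import math
import random
from collections import Counter

def rand_round(lo, hi, rng):
    # approximates ROUND(lo + RAND() * (hi - lo));
    # floor(x + 0.5) sidesteps Python's round-half-to-even
    return math.floor(lo + rng.random() * (hi - lo) + 0.5)

def rand_floor(lo, hi, rng):
    # FLOOR over a range widened by one, so hi can still come out
    return math.floor(lo + rng.random() * (hi - lo + 1))

rng = random.Random(12345)  # fixed seed so the run is repeatable
n = 60000
round_counts = Counter(rand_round(10, 15, rng) for _ in range(n))
floor_counts = Counter(rand_floor(10, 15, rng) for _ in range(n))

for value in range(10, 16):
    print(value, round_counts[value], floor_counts[value])
```

With rounding, 10 and 15 each collect roughly half as many hits as 11 through 14, because each endpoint only owns half an interval. The widened FLOOR variant spreads the hits evenly.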

Iterate over characters in string in mysql

First of all, I have a very concrete question, but an alternative approach to my problem (second part) could also help me.
Is there a way to address a character in a string via its index in MySQL (e.g. in PHP, $var[2] gives you the 3rd character)?
The obvious way is SUBSTRING(var, 3, 1), but since my strings are 1024 characters long I assume this is not the fastest solution. As the code sample shows, using SUBSTRING to retrieve the tail of the string makes no performance difference either. Is there maybe a way to iterate over a string? (Shift the first element?)
CREATE FUNCTION hashDiff( hash1 TEXT(1024), hash2 TEXT(1024), threshold INT)
RETURNS INT
DETERMINISTIC
BEGIN
DECLARE diff, x, b1, b2 INT;
SET diff =0;
SET x = 0;
WHILE (x<1024 AND diff<threshold) DO
SET b1 = ASCII(hash1); -- uses first character only!!
SET b2 = ASCII(hash2);
SET hash1=SUBSTRING(hash1, 2 );
SET hash2=SUBSTRING(hash2, 2 );
SET diff=diff+ ((b1-b2)*(b1-b2));
SET x=x+1;
END WHILE;
RETURN diff;
END
In case it isn't clear from the code, I am trying to write a stored function that calculates the difference, or distance, between two hashes. The difference is the sum of the character-wise squared distances (i.e. hashDiff(AA,AC)=(65-65)²+(65-67)²=4). The first major performance boost could be achieved by introducing a threshold that cancels the calculation once the hashes are already too different. But since MySQL is not my "every day" language, I am stuck at this point in finding other optimizations. For completeness, two sample hashes:
YAAAAAAYAAAYAAVAAQAARAOAAOAQASAQAMAKAKAJIAJAJIAHAHIAKJAIIAHHAHIIAIHGAGFFAGGFEAFEEEEAEDDDDDAEEEEDEEEFAFFFFFFEFFFEFFFFFGFEEFFEEEFFFJEFFEEEEEEELFFFFEEFJEEEEDIEEEEEIEEEEHEEEJEEFKFEFKGGFNHGOIIJTJKYONYNMTGHNHHQISJJQIKWLXJJSMYRQWJOGKDDFCCBBAAAAAAAAAAAAAAAAAAAAAAAYAAAAAAYAAAYAAWAARAASASAAQARAUAYAYATAOALKAJAJIAIAHHAHGAGFAFFAEFFAEFFAFFFAEEFFAFEEEDADEDDDDADDDCDDDDDAEEEFEEEEDDDEEEDEDDEEEFEFFGGFMFHGFFFGFFFLGHGGHGGNHHGGGOHGHGHMGGFGMFFFMFGFLFFFMGFFMGGMGGGNGGMGGLGGLGGMGGLEIEEHDCGCGCDGDGDCGDFCECCECECECECFCECFCFCFCFCFCGCJGYCYAAAAAAYAAAYAAUAATAAUAUAAUARARAQAPAPASARRAPARQAPAQQAQQAQSAKMATKKAIIHAIHGAGGGGAGHHGGAGGFGFFAFFGEFFFFFAFFGFGGGFFFEEFGFFGGFGGHIJJLKLWLKJJIJJJKJRLJKLKKKUKLLKKUMMKJIQIIIISKJJWKLLXMLMYMLNYMMYMLLWJIQIINFGKFFKEEIDHEDHDDFCECCFDECCFCFDGCDGCGCGEGCDCECECFDFCGDGCIEKEOAYNFBREUXKPQMMQTKTMMNJLPPVYYYTOUOPOLLJKKJJJIJIMJJJLIJJLLJIIHHIHHHIGHIHIHJHHHJHHIHGHGHFGHGFFEFEEEFEFEFFGGHIHIHGHGHHIIIIHIIJMNLONKLKKKKKKKMLKKLONMKOOOMLOPONMNMKKLLKKLMNKLMMMNMOPPOORPORSSVRTSSRTRRTSSTTXSTQRPONOKKLKLJMKJJIJIIHHHIIIJHIJIJJIJIKJIMWMYYDAAAAAAAAAAA
AAAAAAAAAAABAABAACAACACAACADADAEADADADADDAEAEEAEAFEAEEAEFAFGAGGGAGGGAHHHAHIIIAIHIJHAIIHIHHAJIHIJIJKJAJJJIKJJJJKKJKJKKLKLKLLMMMNNMYOOOOOOPOONYOONONNPYNOOOPYOOPPPYNONNYMLLWLLKUJIISHIHOGGMFGFLFFMGGLFGLGFLFFKFKFFLEEKFLEFJFKFGNGNHLFHJFIEGDIEKGOIRFGBBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABABACACDACADDACADDADDAEDAFEAEEFAFFFAFFGAFGGGAIGHHIAHIHHHHAHIHIIJJIIAIIIIJIJKJIIIIJJHIIHIIIIJIIIIRJJJJKJJJJLVKLLKLLKXLMMKMXMLLLMWMMMMYMNLYMNNYNNMYMMNYMLYLMLXKJRIHPHIMGGMFEJEJEEIEEHDGCDFCFDCFCECECCEBEBECFDGCFDNGLDBAAAAAAAAAAAAAAAAAAAAAABAAAAABAABABAACACACACACACACADDADAEEAFAFGAFGAHGAGGAGGHAGGIAIHJAJJJJAJKKKKAMLMNNNANOMMNNMMNAONMNOOOMOOPOMNOMMNPOOPPPPRQQYPPRPPPPPNOYLLMMMMLYLMLMLYLMLMMYLNNMYNLLWMLKXLLLUKIKQIIQGHHPFHNGFLFFLGFJEEJEIDDIDCHDFCDGCFCCFCECECCECFCGDGDHDHDIFIDEBBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABAABBBBBCBCCCCDCCCCCCCCDDDDEDEEEEFFDEGGHGHHHGHHHHHHIIJJJJJIJJJJJJIKJJKLKKMMNMMMMMMMNNNNNNLNNONPONNNOOOOPQQQRSSSSSSUTSTUUUVWVVXUYXWVXVXWYVYWYVYYUWVUTTSSPQPQOPOPONONOMONOOONNNMMNLJJKJIIJHHGGGFHFGFFFFEEEDDEEEEFGGIGJLRNEAAAAAAAAAAAAA
Any help or hint would be appreciated.

The only way you would be able to use an array of sorts would be to use temporary tables and cursors/result sets.
The problem is you will still need to iterate over the strings and use substring to break them apart. To my knowledge there is no 'wordwrap' or 'explode' function to chop up the string.
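If the comparison can move out of MySQL and into application code, index-based access to the characters comes for free. The following is a minimal Python sketch of the computation described in the question (sum of squared character differences, stopping once the threshold is reached). The name hash_diff is hypothetical, and unlike the stored function it stops at the end of the shorter string rather than always running to 1024 characters.

```python
def hash_diff(hash1: str, hash2: str, threshold: int) -> int:
    """Sum of squared character-code differences, with early exit."""
    diff = 0
    for c1, c2 in zip(hash1, hash2):
        if diff >= threshold:
            # already too different: stop, as the WHILE condition does
            break
        d = ord(c1) - ord(c2)
        diff += d * d
    return diff

print(hash_diff("AA", "AC", 100))
```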

Finding the Maximum

How to find the following Maximum or supremum by computer software such as Mathematica and Matlab: $\sup\frac{(1+s)^{4}+(s+t)^{4}+t^{4}}{1+s^{4}+t^{4}}$?
Instead of numerical approximation, what is the accurate maximum?
Thanks.

Since the question seems a bit like homework, here's an answer that starts a bit like a lecture:
ask yourself what happens to the function as s and t go to small and to large positive and negative values; this will help you to identify the range of values you should be examining; both Mathematica and Matlab can help you figure this out;
draw the graph of your function over the range of values of interest, develop a feel for its shape and try to figure out where it has maxima; for this the Mathematica Plot3D[] function and the Matlab plot() function will both be useful;
since this is a function of 2 variables, you should think about plotting some of its sections, ie hold s (or t) constant, and make a 2D plot of the section function; again, develop some understanding of how the function behaves;
now you should be able to do some kind of search of the s,t values around the maxima of the function and get an acceptably accurate result.
If this is too difficult then you could use the Mathematica function NMaximize[]. I don't think that Matlab has the same functionality for symbolic functions built-in and you'll have to do the computations numerically but the function findmax will help.

In Matlab, one would create a vector/matrix with s and t values, and a corresponding vector with the function values. Then you can pinpoint the maximum using the function max
In Mathematica, use FindMaximum like this:
f[s_,t_]:= ((1+s)^4 + (s+t)^4 + t^4)/(1+s^4+t^4)
FindMaximum[ f[s,t],{s,0},{t,0} ]
This searches for a maximum starting from (s,t)=(0,0).
For more info, see http://reference.wolfram.com/mathematica/ref/FindMaximum.html
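The grid approach described for Matlab (tabulate s and t, evaluate the function, take the max) can be sketched in a few lines. This is a plain-Python illustration with an assumed search window of [-3, 3] and an assumed step of 0.05. It only yields a numerical approximation that tells you where to look, not the exact supremum the question asks for.

```python
def f(s, t):
    return ((1 + s) ** 4 + (s + t) ** 4 + t ** 4) / (1 + s ** 4 + t ** 4)

def grid_max(lo=-3.0, hi=3.0, steps=120):
    """Evaluate f on a (steps+1) x (steps+1) grid; return (best value, s, t)."""
    best = (float("-inf"), None, None)
    for i in range(steps + 1):
        s = lo + (hi - lo) * i / steps
        for j in range(steps + 1):
            t = lo + (hi - lo) * j / steps
            best = max(best, (f(s, t), s, t))
    return best

value, s, t = grid_max()
print(value, s, t)
```

Since f(1, 1) = 33/3 = 11 exactly, any grid maximum above 11 already shows that the supremum is not attained at s = t = 1. The grid point found here also makes a reasonable starting point for FindMaximum.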

Find Stores by Location

I have a list of about 60 stores with physical addresses and geocodes. I would like to make a simple interface where a user can enter a zip code, or city and state, or even a street address and have the list sort in order of proximity to the entered address. This is a very common feature of websites as I understand.
My plan to do this is to use the Google Maps API to find the geocode of the entered location and use the Pythagorean Theorem to calculate the distance from each location and sort the list by the distances and return the result (or maybe the top 5 of the result set...).
Is this the correct way to do this? Is there a more optimal method, or a function built into the Google Maps API, that will do this? Since this is something so common, I would imagine it has been tried and tested in many ways and there are probably several correct answers. I am just looking for some advice on whether I am going about this the correct way.
Thank you.

The Pythagorean Theorem will not be enough, because of the curvature of the Earth. It requires a bit of spherical geometry. The formula and a simple implementation for finding the distance between two points on Earth (as the crow flies, not actual travel distance) is, in PHP:
// pass the latitudes and longitudes in as degrees
function getDistance($lat1, $long1, $lat2, $long2)
{
    $r = 3963.1; // Earth radius: 3963.1 statute miles; 3443.9 nautical miles; 6378 km
    // convert the degrees to radians
    $lat1 = deg2rad($lat1);
    $lat2 = deg2rad($lat2);
    $long1 = deg2rad($long1);
    $long2 = deg2rad($long2);
    // cosine of the central angle (dot product of the two unit vectors)
    $cosAngle = cos($lat1)*cos($long1)*cos($lat2)*cos($long2) + cos($lat1)*sin($long1)*cos($lat2)*sin($long2) + sin($lat1)*sin($lat2);
    // clamp to [-1, 1] so rounding error cannot make acos() return NAN
    $cosAngle = max(-1.0, min(1.0, $cosAngle));
    return acos($cosAngle) * $r;
}
You could incorporate a version of this in your code. In addition here is a possible (untested) function that is a derivative of another one I have used for MySQL.
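DELIMITER $$
DROP FUNCTION IF EXISTS `FindDist` $$
CREATE FUNCTION `FindDist` (lt1 DOUBLE, lg1 DOUBLE, lt2 DOUBLE, lg2 DOUBLE) RETURNS DOUBLE
DETERMINISTIC
BEGIN
  DECLARE dist, eradius DOUBLE;
  SET eradius = 3963.1; -- Earth radius in statute miles
  -- inputs are in degrees; convert to radians before the trig calls
  SET lt1 = RADIANS(lt1);
  SET lg1 = RADIANS(lg1);
  SET lt2 = RADIANS(lt2);
  SET lg2 = RADIANS(lg2);
  SET dist = ACOS(COS(lt1) * COS(lg1) * COS(lt2) * COS(lg2) + COS(lt1) * SIN(lg1) * COS(lt2) * SIN(lg2) + SIN(lt1) * SIN(lt2)) * eradius;
  RETURN dist;
END $$
DELIMITER ;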
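With only about 60 stores, the sorting step from the question can also be done entirely in application code once the user's location has been geocoded. The following Python sketch uses the same spherical law of cosines and the same 3963.1-mile radius as the functions above. The store list, the names great_circle_miles and nearest, and the coordinates are all made up for illustration.

```python
import math

EARTH_RADIUS_MILES = 3963.1  # same radius as in the answer's code

def great_circle_miles(lat1, lon1, lat2, lon2):
    """Spherical law of cosines; all inputs in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_angle = (math.sin(p1) * math.sin(p2)
                 + math.cos(p1) * math.cos(p2) * math.cos(dlon))
    # clamp so rounding error cannot push the value outside acos's domain
    return EARTH_RADIUS_MILES * math.acos(max(-1.0, min(1.0, cos_angle)))

def nearest(stores, lat, lon, limit=5):
    """Return up to `limit` (distance, name) pairs, closest first."""
    ranked = sorted(
        (great_circle_miles(lat, lon, s_lat, s_lon), name)
        for name, s_lat, s_lon in stores
    )
    return ranked[:limit]

# Hypothetical stores: (name, latitude, longitude)
stores = [("C", 45.0, -75.0), ("A", 40.1, -75.0), ("B", 41.0, -75.0)]
for dist, name in nearest(stores, 40.0, -75.0):
    print(f"{name}: {dist:.1f} miles")
```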

I found this link a while back when I was researching something similar. It uses .NET but the principles would apply to any language/framework.
Store Locator: Help customers find you with Google Maps
The key part of the solution is using the Haversine Formula to find the distance between two points specified as longitude and latitude. There's a C# implementation of this formula linked to in the above article here:
Distance between locations using latitude and longitude (CodeProject)
A bit more rooting around revealed:
Calculate Distance Between Two Points on a Globe in 9 Different Languages
