Select data based on calculated distance between coordinates

Select data based on calculated distance between coordinates - sql-server-2008

Database stores 4 points with coordinates like:
Name | Lat | Long
Point 1 | 11.111 | 22.222
Point 2 | 22.222 | 33.333
Point 3 | 44.444 | 55.555
Point 4 | 66.666 | 77.777
Technology:
MS SQL Server
The web application gets the current user's latitude and longitude via HTML5; it should then work out which of those 4 points are nearer than 0.5 km. How?
It should display Point 1 and Point 2 based on this illustration:

Using SQL Server:
You can find the distance between two coordinates in kilometres using the function below.
CREATE FUNCTION dbo.fnCalcDistanceKM(@lat1 FLOAT, @lat2 FLOAT, @lon1 FLOAT, @lon2 FLOAT)
RETURNS FLOAT
AS
BEGIN
    -- Spherical law of cosines; 6371 is the mean Earth radius in km
    RETURN ACOS(SIN(PI()*@lat1/180.0)*SIN(PI()*@lat2/180.0)+COS(PI()*@lat1/180.0)*COS(PI()*@lat2/180.0)*COS(PI()*@lon2/180.0-PI()*@lon1/180.0))*6371
END
Sample usage (note the parameter order: both latitudes first, then both longitudes):
select [dbo].[fnCalcDistanceKM](13.077085, 13.065701, 80.262675, 80.258916)
Reference
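As a cross-check outside the database, the same spherical law of cosines formula can be sketched in Python. This is only an illustration, not part of the original answer: the function names, the clamp on the cosine, and the user position are assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # same mean Earth radius as the SQL function


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance via the spherical law of cosines."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_angle = (math.sin(phi1) * math.sin(phi2)
                 + math.cos(phi1) * math.cos(phi2) * math.cos(dlon))
    # Clamp to [-1, 1]: rounding can push the value just past 1 for
    # identical points, which would make acos raise ValueError.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_KM * math.acos(cos_angle)


def points_within(points, user_lat, user_lon, radius_km=0.5):
    """Return the names of (name, lat, lon) points within radius_km."""
    return [name for name, lat, lon in points
            if distance_km(user_lat, user_lon, lat, lon) <= radius_km]


POINTS = [("Point 1", 11.111, 22.222), ("Point 2", 22.222, 33.333),
          ("Point 3", 44.444, 55.555), ("Point 4", 66.666, 77.777)]

# Hypothetical user position roughly 0.2 km north of Point 1.
print(points_within(POINTS, 11.1128, 22.222))
```

Note that this takes its arguments as (lat1, lon1, lat2, lon2), which differs from the parameter order of the SQL function.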
Using Entity Framework (.NET):
Entity Framework 5.0 allows you to write a LINQ expression like this:
private Facility GetNearestFacilityToJobsite(DbGeography jobsite)
{
    var q1 = from f in context.Facilities
             let distance = f.Geocode.Distance(jobsite)
             // 500 miles expressed in metres (1609.344 m per mile)
             where distance < 500 * 1609.344
             orderby distance
             select f;
    return q1.FirstOrDefault();
}
Reference
I hope this is enough to get you started.

You need to convert your point to the geography data type. Then you can do a WHERE @here.STDistance(testPoint) < 500 (for geography values in the usual SRID 4326, STDistance returns metres, so 500 corresponds to 0.5 km).
The basics of using the geography point to calculate distance can be found in this question.

Related

SQL - Agg Func Manhattan Distance

The SO link doesn't answer the question. I can't figure out how to solve this query on HackerRank. None of the solutions online seem to be working. Is this a bug or am I doing something wrong?
Consider P1(a,b) and P2(c,d) to be two points on a 2D plane.
a happens to equal the minimum value in Northern Latitude (LAT_N in STATION).
b happens to equal the minimum value in Western Longitude (LONG_W in STATION).
c happens to equal the maximum value in Northern Latitude (LAT_N in STATION).
d happens to equal the maximum value in Western Longitude (LONG_W in STATION).
Query the Manhattan Distance between points P1 and P2 and round it to a scale of 4 decimal places.
Input Format
The STATION table is described as follows:
STATION Table
Field | Type
ID | Number
City | VarChar2(21)
State | VarChar2(2)
LAT_N | Number
LONG_W | Number
Database: MySQL
Source: https://www.hackerrank.com/challenges/weather-observation-station-18/problem
Link: distance between two longitude and latitude (Tried, but none of the answers provided work.)
SELECT ROUND(ABS(MIN(Station.LAT_N) - MIN(Station.LONG_W)) + ABS(MAX(Station.LAT_N) - MAX(Station.Long_W)), 4)
FROM Station;

The formula for Manhattan distance is |a - c| + |b - d|, where a and b are the min lat and long and c and d are the max lat and long respectively.
select
    round(
        abs(min(lat_n) - max(lat_n))
        + abs(min(long_w) - max(long_w)),
        4
    )
from
    station;
I got 25 hakker points! so can I get 25 points from you?

Without just writing the answer: you need to calculate the horizontal difference between the min and max longitude, and add the vertical difference between the min and max latitude.
Your code does something a bit different. If you update your code accordingly, then the rest is OK and will be marked as correct by hackerrank.

You are comparing latitude and longitude when instead you need to compare latitude with latitude and longitude with longitude. The Manhattan distance between (1,3) and (2,4) is |1-2|+|3-4|, not |1-4|+|2-3|.
It should also be pointed out that since you're taking the min and max of the same range, you don't actually need the absolute value function. round(max(x)-min(x)+max(y)-min(y), 4) works perfectly well - and is slightly faster.
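To make the pairing concrete, here is a small Python sketch of the same calculation. The station values are made up for illustration and the function name is hypothetical; it only shows that latitude is compared with latitude and longitude with longitude, and that no absolute value is needed when subtracting a minimum from a maximum.

```python
def manhattan_extent(stations):
    """Manhattan distance between (min lat, min lon) and (max lat, max lon).

    stations is a list of (lat_n, long_w) pairs.
    """
    lats = [lat for lat, _ in stations]
    lons = [lon for _, lon in stations]
    # max - min is never negative, so abs() is unnecessary here.
    return round((max(lats) - min(lats)) + (max(lons) - min(lons)), 4)


# Hypothetical sample rows: (LAT_N, LONG_W)
STATIONS = [(10.0, 20.0), (12.5, 27.25), (11.0, 21.0)]

print(manhattan_extent(STATIONS))  # (12.5 - 10.0) + (27.25 - 20.0) = 9.75
```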

My answer for MS SQL
SELECT CAST(
ABS(MAX(LAT_N) - MIN(LAT_N)) + ABS(MAX(LONG_W) - MIN(LONG_W))
AS DECIMAL(20, 4))
FROM STATION

select round((max(lat_n)-min(lat_n)),4)+round((max(long_w)-min(long_w)),4)
from station;
As we will get result from diff of max and min we don't need abs.
The above code works for Sql Problem

SELECT ROUND(ABS(MAX(Station.LAT_N) - MIN(Station.LONG_W)) + ABS(MIN(Station.LAT_N) - MAX(Station.Long_W)), 4)
FROM Station;

Adding numbers with decimal returns rounded number in MySQL

I have a table with three column:
Source / Target / Weight
x / y / 0.2
x / y / 0.2
z / a / 0.5
The "weight" column is of type FLOAT. I am running a select to group all of the duplicates and add the "weight" scores together. Here is the query:
SELECT source, target, sum(weight) as weight2
FROM mytable
GROUP BY source, target
Oddly, after I run the query, it seems that any value below 1 in the "weight" section (e.g. 0.2) is rounded to 1. So I obtain the following table:
Source / Target / Weight
x / y / 2
z / a / 1
Where the scores should have been 0.4 and 0.5. What am I doing wrong?

I just ran this on my instance of MySQL 5.5.30:
mysql> create table mytable (source char(1), target char(1), weight float);
mysql> insert into mytable values
-> ('x','y',0.2),
-> ('x','y',0.2),
-> ('z','a',0.5);
mysql> SELECT source, target, sum(weight) as weight2
-> FROM mytable
-> GROUP BY source, target;
+--------+--------+--------------------+
| source | target | weight2            |
+--------+--------+--------------------+
| x      | y      | 0.4000000059604645 |
| z      | a      |                0.5 |
+--------+--------+--------------------+
MySQL does not do rounding up to 1 as you describe. All I can guess is that you rounded up the values as you inserted them. I would recommend double-checking the data without doing a SUM() or GROUP BY, to see what the values are.
You may notice that in my output above, the SUM on the first row is not exactly 0.4, but instead it's a floating-point value near 0.4. You should probably not use FLOAT if you are concerned about rounding errors.
Read What Every Computer Scientist Should Know About Floating-Point Arithmetic, by David Goldberg.
Or a shorter treatment of this issue in the MySQL manual: Problems with Floating-Point Values.
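The not-quite-0.4 sum can be reproduced outside MySQL. The sketch below is an illustration under one assumption: that a FLOAT column stores an IEEE 754 single-precision value and that the sum is carried out in double precision. It rounds 0.2 to single precision with the standard struct module and compares the result with exact decimal arithmetic. The helper name as_float32 is made up for the example.

```python
import struct
from decimal import Decimal


def as_float32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


# What a single-precision column would hold for 0.2, summed as doubles.
single_sum = as_float32(0.2) + as_float32(0.2)

# Exact decimal arithmetic, comparable to a DECIMAL column.
decimal_sum = Decimal("0.2") + Decimal("0.2")

print(single_sum)   # slightly above 0.4
print(decimal_sum)  # 0.4
```

If exact sums matter, declaring the column as DECIMAL with a suitable scale avoids the representation error.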

MySql - AVG() and STD() function , weird results...

I'm going crazy with the results from MySQL regarding standard functions:
- AVG() the average
- STD() the standard deviation
Check the following results from my table 'Auction':
mysql> SELECT avg(buyout) avg FROM auction where buyout <> 0 and item =72988;
+-------------+
| avg         |
+-------------+
| 234337.3622 |
+-------------+
That result looks correct, no issue.
But when I run std:
mysql> SELECT std(buyout) std FROM auction where buyout <> 0 and item =72988;
+-------------+
| std         |
+-------------+
| 574373.6098 |
+-------------+
The STD is greater than the AVG (STD > AVG), and that's... impossible because my AVG > 0.
Where am I wrong here ... ?
thx in advance !

There is no mathematical constraint saying that if mean is positive it has to be smaller than the standard deviation.
I read the extract of your data in R
data <- read.table("extract_72988.csv", h=1, sep="\t")
And confirmed that
> mean(data$BUYOUT)
[1] 234337.4
> sd(data$BUYOUT)
[1] 574421.3
Further analysis of your data shows that it is far from being normally distributed
Here is a histogram of your data:
And here is the histogram of log-transformed data
And finally a normal Q-Q plot
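A tiny made-up data set shows that a standard deviation larger than a positive mean is perfectly possible for skewed, all-positive data. The numbers below are invented for illustration and are not taken from the auction table. The sketch also prints both flavours of standard deviation: MySQL's STD() is the population form (divide by n), while R's sd() is the sample form (divide by n - 1), which may account for the small gap between 574373.6 and 574421.3 above.

```python
import statistics

# Hypothetical prices: mostly cheap, with one expensive outlier.
prices = [1, 1, 1, 1, 100]

mean = statistics.mean(prices)
population_sd = statistics.pstdev(prices)  # divides by n
sample_sd = statistics.stdev(prices)       # divides by n - 1

print(mean)           # 20.8
print(population_sd)  # about 39.6, larger than the mean
print(sample_sd)      # larger still, because of the n - 1 divisor
```

Every value is positive, yet the standard deviation is almost twice the mean; "mean minus one standard deviation" falling below zero only signals that the distribution is not symmetric.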

Or, said differently: we are looking at auction prices. Each price in the database is a positive value. Our mean is neither reduced nor centered and is around 2.35, but the computed standard deviation is higher than 2.35. Plotted as a graph, this would mean that the prices move around the mean by more than the mean itself. If we draw this standard deviation "to the left" of our mean, it would say that there is some probability of finding a NEGATIVE price -> impossible!
Right ?

Why doesn't this sql query return any results comparing floating point numbers?

I have this in a mysql table:
id and bolag_id are int. lat and lngitude are double.
If I use the lngitude column, no results are returned:
lngitude Query: SELECT * FROM location_forslag WHERE `lngitude` = 13.8461208
However, if I use the lat column, it does return results:
lat Query: SELECT * FROM location_forslag WHERE `lat` = 58.3902782
What is the problem with the lngitude column?

It is not generally a good idea to compare floating point numbers with = equals operator.
Is it correct to compare two rounded floating point numbers using the == operator?
Dealing with accuracy problems in floating-point numbers
For your application, you need to consider how close you want the answer to be.
1 degree is about 111 km, and 0.00001 degrees is about 1.1 metres (at the equator). Do you really want your application to say "not equal" if two points differ by 0.00000001 degrees, which is about 1 mm?
SET @EPSILON = 0.00001; /* 1.1 metres at equator */
SELECT * FROM location_forslag
WHERE `lngitude` >= 13.8461208 - @EPSILON
AND `lngitude` <= 13.8461208 + @EPSILON;
This will return points where lngitude is within @EPSILON degrees of the desired value.
You should choose a value for epsilon which is appropriate to your application.
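The same tolerance test can be sketched in application code. The following Python example is an illustration only; the stored value and the helper name are hypothetical. It shows an exact comparison failing for a value that differs in the last few digits, while a comparison with an absolute tolerance succeeds.

```python
import math

EPSILON_DEG = 0.00001  # about 1.1 metres at the equator


def same_coordinate(a, b, eps=EPSILON_DEG):
    """True when a and b differ by at most eps degrees."""
    # rel_tol=0.0 so that only the absolute tolerance applies.
    return math.isclose(a, b, rel_tol=0.0, abs_tol=eps)


# Hypothetical stored value that is not bit-for-bit equal to the literal.
stored = 13.84612080000001

print(stored == 13.8461208)                 # False: exact comparison fails
print(same_coordinate(stored, 13.8461208))  # True: within the tolerance
```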

Floating points are irritating....
WHERE ABS(lngitude - 13.8461208) < 0.00000005

Convert float to decimal for the comparison, giving the decimal an explicit scale (in SQL Server a bare decimal defaults to scale 0, which would round the coordinates to whole degrees). I had the same problem and solved it like this:
SELECT
    [dbo].[Story].[Longitude],
    [dbo].[Story].[Latitude],
    [dbo].[Story].[Location]
FROM
    [dbo].[Story],
    [dbo].[Places]
WHERE
    convert(decimal(9, 6), [dbo].[Story].[Latitude]) = convert(decimal(9, 6), [dbo].[Places].[Latitude])
    and
    convert(decimal(9, 6), [dbo].[Story].[Longitude]) = convert(decimal(9, 6), [dbo].[Places].[Longitude])
    and
    [dbo].[Places].[Id] = @PlacesID
    and
    [dbo].[Story].IsDraft = 0
ORDER BY
    [dbo].[Story].[Time] desc
Look at the first three lines after the WHERE clause.
Hope it helps.

SQL Query For Total Points Within Radius of a Location

I have a database table of all zipcodes in the US that includes city,state,latitude & longitude for each zipcode. I also have a database table of points that each have a latitude & longitude associated with them. I'd like to be able to use 1 MySQL query to provide me with a list of all unique city/state combinations from the zipcodes table with the total number of points within a given radius of that city/state. I can get the unique city/state list using the following query:
select city,state,latitude,longitude
from zipcodes
group by city,state order by state,city;
I can get the number of points within a 100 mile radius of a specific city with latitude '$lat' and longitude '$lon' using the following query:
select count(*)
from points
where (3959 * acos(cos(radians($lat)) * cos(radians(latitude)) * cos(radians(longitude) - radians($lon)) + sin(radians($lat)) * sin(radians(latitude)))) < 100;
What I haven't been able to do is figure out how to combine these queries in a way that doesn't kill my database. Here is one of my sad attempts:
select city,state,latitude,longitude,
(select count(*) from points
where status="A" AND
(3959 * acos(cos(radians(zipcodes.latitude)) * cos(radians(latitude)) * cos(radians(longitude) - radians(zipcodes.longitude)) + sin(radians(zipcodes.latitude)) * sin(radians(latitude)))) < 100) as 'points'
from zipcodes
group by city,state order by state,city;
The tables currently have the following indexes:
Zipcodes - `zip` (zip)
Zipcodes - `location` (state,city)
Points - `status_length_location` (status,length,longitude,latitude)
When I run explain before the previous MySQL query here is the output:
+----+--------------------+----------+------+------------------------+------------------------+---------+-------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+----------+------+------------------------+------------------------+---------+-------+-------+---------------------------------+
| 1 | PRIMARY | zipcodes | ALL | NULL | NULL | NULL | NULL | 43187 | Using temporary; Using filesort |
| 2 | DEPENDENT SUBQUERY | points | ref | status_length_location | status_length_location | 2 | const | 16473 | Using where; Using index |
+----+--------------------+----------+------+------------------------+------------------------+---------+-------+-------+---------------------------------+
I know I could loop through all the zipcodes and calculate the number of matching points within a given radius but the points table will be growing all the time and I'd rather not have stale point totals in the zipcodes database. I'm hoping a MySQL guru out there can show me the error of my ways. Thanks in advance for your help!

MySQL Guru or not, the problem is that unless you find a way of filtering out various rows, the distance needs to be calculated between each point and each city...
There are two general approaches that may help the situation:
- make the distance formula simpler
- filter out unlikely candidates to the 100-mile radius from a given city
Before going into these two avenues of improvement, you should decide on the level of precision desired with regard to this 100-mile distance; you should also indicate which geographic area is covered by the database (is this just the continental USA, etc.).
The reason for this is that, while more precise numerically, the Great Circle formula is very computationally expensive. Another avenue of performance improvement would be to store "grid coordinates" of sorts in addition to (or instead of) the lat/long coordinates.
Edit:
A few ideas about a simpler (but less precise) formula:
Since we're dealing with relatively small distances, (and I'm guessing between 30 and 48 deg Lat North), we can use the euclidean distance (or better yet the square of the euclidean distance) rather than the more complicated spherical trigonometry formulas.
depending on the level of precision expected, it may even be acceptable to have one single parameter for the linear distance for a full degree of longitude, taking something average over the area considered (say circa 46 statute miles). The formula would then become
LatDegInMi = 69.0
LongDegInMi = 46.0
DistSquared = ((Lat1 - Lat2) * LatDegInMi) ^2 + ((Long1 - Long2) * LongDegInMi) ^2
On the idea of a columns with grid info to filter to limit the number of rows considered for distance calculation.
Each "point" in the system, be it a city or another point (delivery locations, store locations... whatever), is assigned two integer coordinates which define the square of, say, 25 miles * 25 miles where the point lies. The coordinates of any point within 100 miles of the reference point (a given city) will be at most +/- 4 in the x direction and +/- 4 in the y direction. We can then write a query similar to the following:
SELECT Z.city, Z.state, Z.latitude, Z.longitude, COUNT(*) AS points
FROM zipcodes Z
JOIN points P
    ON P.GridX BETWEEN Z.GridX - 4 AND Z.GridX + 4
    AND P.GridY BETWEEN Z.GridY - 4 AND Z.GridY + 4
WHERE P.status = 'A'
    -- POW() is used because ^ is bitwise XOR in MySQL
    AND POW((Z.latitude - P.latitude) * 69.0, 2)      -- LatDegInMi
      + POW((Z.longitude - P.longitude) * 46.0, 2)    -- LongDegInMi
      < POW(100, 2)
GROUP BY Z.city, Z.state, Z.latitude, Z.longitude;
Note that the LongDegInMi could either be hardcoded (same for all locations within continental USA), or come from corresponding record in the zipcodes table. Similarly, LatDegInMi could be hardcoded (little need to make it vary, as unlike the other it is relatively constant).
The reason why this is faster is that for most records in the cartesian product between the zipcodes table and the points table, we do not calculate the distance at all. We eliminate them on the basis of an index value (the GridX and GridY).
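How the grid columns might be computed is not spelled out in the answer, so the following Python sketch is only one possible reading. It assumes the flat-earth constants from the formula above (69 and 46 miles per degree) and 25-mile cells, and the function names are hypothetical. With cells defined this way, two points at most 100 miles apart along an axis can differ by at most 4 cells along that axis, which is what the +/- 4 filter relies on.

```python
import math

LAT_DEG_IN_MI = 69.0   # miles per degree of latitude (from the answer)
LONG_DEG_IN_MI = 46.0  # rough miles per degree of longitude (from the answer)
CELL_MI = 25.0         # side of one grid square in miles


def grid_cell(lat, lon):
    """Integer (GridX, GridY) of the 25-mile square containing the point."""
    return (math.floor(lon * LONG_DEG_IN_MI / CELL_MI),
            math.floor(lat * LAT_DEG_IN_MI / CELL_MI))


def may_be_within_100_miles(cell_a, cell_b):
    """Cheap prefilter: cells more than 4 apart cannot be within 100 miles."""
    return (abs(cell_a[0] - cell_b[0]) <= 4
            and abs(cell_a[1] - cell_b[1]) <= 4)


city = grid_cell(40.0, -90.0)    # hypothetical city
nearby = grid_cell(40.5, -90.5)  # a few dozen miles away
far = grid_cell(45.0, -90.0)     # about 345 miles north

print(may_be_within_100_miles(city, nearby))  # True: needs the exact check
print(may_be_within_100_miles(city, far))     # False: rejected without any trig
```

Points that pass the prefilter still need the distance test, since the 9 x 9 block of cells covers more than the 100-mile circle.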
This brings us to the question of which SQL indexes to produce. For sure, we may want:
- GridX + GridY + Status (on the points table)
- GridY + GridX + status (possibly)
- City + State + latitude + longitude + GridX + GridY on the zipcodes table
An alternative to the grids is to "bound" the limits of latitude and longitude which we'll consider, based on the latitude and longitude of a given city, i.e. the JOIN condition becomes a range rather than an IN:
JOIN points P
ON P.latitude > (Z.Latitude - (100 / LatDegInMi))
AND P.latitude < (Z.Latitude + (100 / LatDegInMi))
AND P.longitude > (Z.longitude - (100 / LongDegInMi))
AND P.longitude < (Z.longitude + (100 / LongDegInMi))
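The range idea can be sketched in Python as a two-stage filter: a cheap bounding box first, then the exact great-circle check on the survivors. This is an illustration under stated assumptions rather than the answerer's code. It scales the longitude range by the cosine of the city's latitude instead of using a fixed LongDegInMi, it reuses the 3959-mile radius from the question, and the sample coordinates and all names are made up. The box is an approximation and could clip the circle slightly at high latitudes.

```python
import math

EARTH_RADIUS_MI = 3959.0  # radius used in the question's formula
LAT_DEG_IN_MI = 69.0      # miles per degree of latitude (from the answer)


def bounding_box(lat, lon, radius_mi):
    """Approximate lat/lon ranges enclosing a circle of radius_mi."""
    dlat = radius_mi / LAT_DEG_IN_MI
    dlon = radius_mi / (LAT_DEG_IN_MI * math.cos(math.radians(lat)))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


def distance_mi(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (spherical law of cosines)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    c = (math.sin(p1) * math.sin(p2)
         + math.cos(p1) * math.cos(p2) * math.cos(math.radians(lon2 - lon1)))
    return EARTH_RADIUS_MI * math.acos(max(-1.0, min(1.0, c)))


def candidates_in_box(city, points, radius_mi=100.0):
    """Points whose coordinates fall inside the bounding box."""
    lat_lo, lat_hi, lon_lo, lon_hi = bounding_box(city[0], city[1], radius_mi)
    return [p for p in points
            if lat_lo < p[0] < lat_hi and lon_lo < p[1] < lon_hi]


def count_within(city, points, radius_mi=100.0):
    """Exact count, computing trig only for the box candidates."""
    return sum(1 for p in candidates_in_box(city, points, radius_mi)
               if distance_mi(city[0], city[1], p[0], p[1]) < radius_mi)


CITY = (40.0, -90.0)  # hypothetical city
POINTS = [(40.5, -90.5), (41.4, -90.0), (41.3, -91.7), (45.0, -90.0)]

print(len(candidates_in_box(CITY, POINTS)))  # 3 pass the cheap range test
print(count_within(CITY, POINTS))            # 2 are really within 100 miles
```

The third point sits in a corner of the box but outside the circle, which is why the exact check is still applied after the range filter.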

When I do these types of searches, my needs allow some approximation. So I use the formula you have in your second query to first calculate the "bounds" (the four lat/long values at the extremes of the allowed radius), then take those bounds and do a simple query to find the matches within them (less than the max lat and long, more than the min lat and long). So what I end up with is everything within a square that encloses the circle defined by the radius.

SELECT * FROM tblLocation
WHERE 2 > POWER(POWER(Latitude - 40, 2) + POWER(Longitude - -90, 2), .5)
where the 2 > part would be the number of parallels away and 40 and -90 are lat/lon of the test point
Sorry I didn't use your tablenames or structures, I just copied this out of one of my stored procedures I have in one of my databases.
If I wanted to see the number of points in a zip code I suppose I would do something like this:
SELECT
ParcelZip, COUNT(LocationID) AS LocCount
FROM
tblLocation
WHERE
2 > POWER(POWER(Latitude - 40, 2) + POWER(Longitude - -90, 2), .5)
GROUP BY
ParcelZip
Getting the total count of all locations in the range would look like this:
SELECT
COUNT(LocationID) AS LocCount
FROM
tblLocation
WHERE
2 > POWER(POWER(Latitude - 40, 2) + POWER(Longitude - -90, 2), .5)
A cross join may be inefficient here since we are talking about a large quantity of records but this should do the job in a single query:
SELECT
    ZipCodes.ZipCode, COUNT(PointID) AS LocCount
FROM
    Points
CROSS JOIN
    ZipCodes
WHERE
    2 > POWER(POWER(Points.Latitude - ZipCodes.Latitude, 2) + POWER(Points.Longitude - ZipCodes.Longitude, 2), .5)
GROUP BY
    ZipCodes.ZipCode
