How can I sort/group Salesforce leads by geography?

If I had lat/long data for all our leads in Salesforce, is there a way to write a query to group them, or, say, list all the leads within 10 miles of San Francisco, CA?
[EDIT: Clarification]
I have thousands of leads, each with both a full address and lat/long coordinates.
I want to build a query on these leads that will give me all of the leads near San Francisco, CA. This means doing GIS-type work within Salesforce.
I could of course filter specifically on city, ZIP code, or area code, but this presents some problems when trying to roll up a whole metro area.

Yes. You need to reverse-geocode them with a tool/service. In the past I have used Maporama's service, but it was quite expensive, and that was before Google Maps and Virtual Earth existed, so I am sure there is something cheaper (or free) out there now. Googling around, I have found this and this.
EDIT:
OK, from what I understand, you are trying to calculate the distance between two lat/long points. I would start by discarding the ones that are outside your radius of (let's say) 10 miles: from your central point, get the coordinates 10 miles east, west, south, and north. To do this you can use the great-circle distance formula.
From that point you have your Salesforce data. If you wish to break this data up further, you then need to order the points by distance from the central point. To do this you can use the haversine formula; see the sketch after the links below.
I am not sure what your language preference is, so I have just included some examples in SQL (mainly) and C#:
Haversine Formula in C# and in SQL
Determine the distance between ZIP codes using C#
Great Circle SQL
Great Circle 2
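To make the two-step approach concrete, here is a minimal sketch in Python; the helper names, the example leads, and the San Francisco center point are my own illustrative assumptions, not taken from the linked examples:

```python
import math

EARTH_RADIUS_MI = 3959.0  # mean Earth radius in miles

def haversine_mi(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two lat/long points, in miles."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MI * math.asin(math.sqrt(a))

def leads_within(center_lat, center_lon, radius_mi, leads):
    """leads: iterable of (name, lat, lon). Returns (miles, name) sorted nearest first."""
    # Step 1: cheap bounding-box filter to discount points that cannot be in range.
    dlat = math.degrees(radius_mi / EARTH_RADIUS_MI)
    dlon = dlat / math.cos(math.radians(center_lat))  # longitude degrees shrink with latitude
    boxed = [(n, la, lo) for n, la, lo in leads
             if abs(la - center_lat) <= dlat and abs(lo - center_lon) <= dlon]
    # Step 2: exact haversine distance, keep those in range, order by proximity.
    dists = [(haversine_mi(center_lat, center_lon, la, lo), n) for n, la, lo in boxed]
    return sorted(d for d in dists if d[0] <= radius_mi)

# Example: leads within 10 miles of San Francisco, CA (37.7749, -122.4194).
leads = [("Oakland lead", 37.8044, -122.2712), ("San Jose lead", 37.3382, -121.8863)]
print(leads_within(37.7749, -122.4194, 10, leads))
```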

Use GeoHash.org (either as a web service or by implementing the algorithm yourself). It hashes your lat/long coordinates into a string that looks similar for nearby places: for example, A may have a hash like "akusDf3af" and B might have a hash like "akusDf3b2" if they are nearby. Then do a SOQL query that looks for places starting with the same n characters as a known location; your choice of n determines the radius of the lookup.
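If you would rather compute the hashes yourself than call the web service, here is a minimal sketch of the standard geohash encoding in Python; the San Francisco coordinates and the Geohash__c custom field named in the comment are illustrative assumptions:

```python
# Standard geohash: interleave longitude/latitude bits, emit base32 characters.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash(lat, lon, precision=9):
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    chars, ch, bits, even = [], 0, 0, True
    while len(chars) < precision:
        rng, val = (lon_rng, lon) if even else (lat_rng, lat)  # alternate lon/lat
        mid = (rng[0] + rng[1]) / 2
        if val > mid:
            ch = (ch << 1) | 1
            rng[0] = mid
        else:
            ch = ch << 1
            rng[1] = mid
        even, bits = not even, bits + 1
        if bits == 5:  # 5 bits per base32 character
            chars.append(BASE32[ch])
            ch, bits = 0, 0
    return "".join(chars)

print(geohash(37.7749, -122.4194))  # San Francisco -> '9q8yyk8yt'
# Store the hash on each lead, then prefix-match in SOQL, e.g.
# SELECT Id FROM Lead WHERE Geohash__c LIKE '9q8yy%'  (Geohash__c is hypothetical)
```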

These are some great technical solutions that can provide very exact answers, but two things to consider:
geospatial proximity does not map neatly to responsibility
Ownership calculation seems to be done best through postal-code lookups or other rules that don't allow for gaps or overlaps. Otherwise, you'll have two (or more) salespeople fighting over leads that are close to both of them, while ignoring the leads that are far away from everyone.
So, if you're using geo-calculations like those above to assign ownership, just acknowledge that the system will leak, and create business rules to accommodate that. A simple postal lookup to define territories (as Salesforce's own territory management feature does) might be better.
I'd suggest the problem we're trying to solve geospatially is not who owns which lead. Rather, given all the leads you own, which are nearby?
maps often offer more data per pixel than columnar reports
Again, geospatial data in a report may not be the best answer. A lead 50 km away, but along a major road, is more interesting than another lead 50 km away on the other side of a mountain or lake. And a lead close to other leads is more interesting than a lead off by itself.
A report can't show this, but a map can.
Salesforce has some great examples of Google Maps integrations. Instead of a columnar report called "My Nearby Leads", why not a Visualforce page with a Google map inside? You'd be giving the user far more information than a columnar report could, they might like it better, and it's easier to implement than calculating some of the equations above.
Just another perspective that may (or may not) be appropriate to the problem at hand.

This post is really old, but it is showing up at the top of Google results, so I figured I would add some info to it anyway.
Two nice mapping tools are batchgeo.com and geocod.io. Geocod.io can even give you lat and long coordinates from an address.
If you just need a one-time calculation, you can use Excel. Export all your leads with their lat and long. Then go to Google Maps and get the lat and long, in decimal degrees, for the city center of wherever you want to measure from.
Then use this formula in Excel to calculate the distance between the coordinates, in miles. lat1dd and long1dd are the coordinates for one point, and lat2dd and long2dd are the coordinates for the other point.
=3963*ACOS(COS(RADIANS(90-lat1dd))*COS(RADIANS(90-lat2dd))+SIN(RADIANS(90-lat1dd))*SIN(RADIANS(90-lat2dd))*COS(RADIANS(long1dd-long2dd)))
After you run it, just sort the results from smallest to largest to find the leads that are closest.
I haven't done this next part yet, but conceptually it should work. We have a field that lists the major market each account is in, for example "Chicago IL". I am going to build a trigger or formula field that essentially says IF(Market = "Chicago IL"), then use X and Y for the lat and long, hardcoded as the city center for that specific market. The query will then run each individual account's lat and long against the city center's to calculate a distance.
If you wanted to break the market into different zones, you could adjust the formula to use < and > on the lat and long fields: everything less than X but greater than Y goes in Zone A, and so on. A rough sketch of both ideas follows.
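Here is a minimal sketch in Python of the hardcoded-center lookup and the comparison-based zones; the market names, center coordinates, and quadrant scheme are made-up assumptions (haversine_mi is the helper from the sketch earlier on this page):

```python
# Hardcoded city centers per market (illustrative values).
MARKET_CENTERS = {
    "Chicago IL": (41.8781, -87.6298),
    "San Francisco CA": (37.7749, -122.4194),
}

def market_distance_mi(market, lat, lon):
    """Distance from an account to its market's hardcoded city center."""
    center_lat, center_lon = MARKET_CENTERS[market]
    return haversine_mi(center_lat, center_lon, lat, lon)

def market_zone(market, lat, lon):
    """Bucket a point into coarse zones using only < / > comparisons."""
    center_lat, center_lon = MARKET_CENTERS[market]
    ns = "N" if lat >= center_lat else "S"
    ew = "E" if lon >= center_lon else "W"
    return ns + ew  # e.g. "NE" = the quadrant north-east of the city center

print(market_distance_mi("Chicago IL", 41.9742, -87.9073))  # O'Hare, ~15 mi
print(market_zone("Chicago IL", 41.9742, -87.9073))         # "NW"
```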
Hope this helps someone.

Related

GeoNames vs Google Maps

I am building an application that uses both the GeoNames and Google Places APIs. When I do a nearby search at a specific location (say lat: 47.16, lng: 27.56) on both services, I do not know how to remove entities that appear in both the Google Places results and the GeoNames (findNearby) results. I was thinking about using location (latitude and longitude), but it isn't accurate enough. The names also vary considerably, so that wouldn't work either. Another idea that crossed my mind was using the types (feature codes for GeoNames and types for Google Places), but there are a lot of types and obviously I cannot cross-reference them manually. Any ideas?
Note: I want to use both of them as this is a school project and the requirements specify using more than one source of info.
Thanks.
I think that, unfortunately, the reason you haven't received an answer is that there is no answer that would fully satisfy the requirements.
Even with one provider, a single coordinate can be associated with multiple results. Imagine a large building in New York, for example, where dozens of companies each occupy a floor or part of a floor, and yet they would all be associated with the same (latitude, longitude) coordinate.
Now consider two sources. Source A says there's a doctor at that location (let's say on the 7th floor). Source B says there's a doctor too. Can we assume they're the same doctor? Nope. It might be another doctor on another floor, or it could be the same doctor; it's impossible to tell. The point is, you could try to use feature codes/types to reduce the number of hits by assuming similar locations are the same location, but it remains an assumption.
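If you decide to live with that assumption, a minimal sketch of the heuristic in Python might look like this; the 50 m threshold, the record shapes, and the shared "category" field (your own mapping of GeoNames feature codes and Google Places types onto one vocabulary) are all assumptions for illustration:

```python
import math

def meters_apart(a, b):
    """Approximate ground distance in meters (equirectangular; fine at ~50 m scales)."""
    lat = math.radians((a["lat"] + b["lat"]) / 2)
    dx = math.radians(b["lon"] - a["lon"]) * math.cos(lat)
    dy = math.radians(b["lat"] - a["lat"])
    return 6371000 * math.hypot(dx, dy)

def probably_same(a, b, max_meters=50):
    """Heuristic only: very close + same coarse category => assume duplicate."""
    return meters_apart(a, b) <= max_meters and a["category"] == b["category"]

# Hypothetical records from the two services.
google_places = [{"name": "Dr. Pop", "lat": 47.1600, "lon": 27.5600, "category": "doctor"}]
geonames = [{"name": "Cabinet Dr. Pop", "lat": 47.1601, "lon": 27.5601, "category": "doctor"},
            {"name": "Liceul X", "lat": 47.1700, "lon": 27.5700, "category": "school"}]

merged = google_places + [g for g in geonames
                          if not any(probably_same(g, p) for p in google_places)]
print([p["name"] for p in merged])  # the nearby doctor is assumed to be a duplicate
```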
Anyway, good luck with your assignment from 3 years ago. It was a good idea nonetheless. :)

Compiling a list of all colonies/neighborhoods for a particular city

I want a list of locations (coordinates) for all possible colonies/neighborhoods of some Indian cities, take Delhi for example. Can this data be obtained with the Places API?
The only thing that comes to my mind is to use a query like this:
https://maps.googleapis.com/maps/api/place/search/xml?location=28.540346,77.210026&radius=500&types=administrative_area_level_1|administrative_area_level_2|administrative_area_level_3|locality|neighborhood|street_address|sublocality|sublocality_level_4|sublocality_level_5|sublocality_level_3|sublocality_level_2|sublocality_level_1|subpremise&sensor=false&key=MYKEY
and then keep increasing the radius in steps of 500 until the whole city is covered.
Is there a better way of doing this?
Given how often you would need to do this for your map (caching that data goes against the terms of service), this is not a great approach. If your map gets any decent usage, you'll rapidly hit your quota. Plus, you only get the center points of the colonies/neighborhoods. I'd recommend trying to find another source for that data that you can download; the Places API was not designed with this in mind.

How do I discern a country from X/Y Coordinates (or Long/Lat)?

I have a number of X/Y coordinates and am looking to discern which country each of them is in. Does anyone know a good service/way of doing this?
I am working with MySQL & PHP, not that it's really relevant. I am au fait with consuming web services/pages and assume there must be a web service/page somewhere that will do this; if someone can point me in the right direction, that would be awesome.
How do I take: 306458,383136 and turn it into: United Kingdom (for example.)
Appreciate your responses in advance.
What you're looking for is called reverse geocoding, and Google Maps, for example, has this functionality: http://code.google.com/apis/maps/documentation/geocoding/#ReverseGeocoding
It works from a lat/long coordinate and can return even more precise information than just the country. Note that the result is only an estimate; in some places, country boundaries are somewhat tangled.
If you're looking to do this offline, or if Google's/Bing's/whoever's licensing is too strict for you (e.g. you need to make a gazillion requests per day, or need to present the results in an unorthodox format), it's possible to run your own instance of Nominatim, feed it a data extract from OpenStreetMap (under the ODbL, a much more permissive license), and query that; a sketch against the public endpoint follows below.
For example, there's a set of boundaries available at https://wambachers-osm.website/boundaries/ - just the national boundaries, so you wouldn't need to download the entire planet map.
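For illustration, a reverse-geocoding call against the public Nominatim demo endpoint can look like this in Python (mind its usage policy; also note that Nominatim wants lat/long, so grid coordinates like 306458,383136, which look like British National Grid eastings/northings, would need converting first). The sample coordinate is an assumption:

```python
import json
import urllib.request

def country_of(lat, lon):
    """Ask Nominatim which country a lat/long point falls in."""
    url = ("https://nominatim.openstreetmap.org/reverse"
           f"?lat={lat}&lon={lon}&format=json&zoom=3")  # zoom=3 ~ country level
    req = urllib.request.Request(url, headers={"User-Agent": "country-lookup-demo"})
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data.get("address", {}).get("country")

print(country_of(53.0, -2.0))  # a point in England -> "United Kingdom"
```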

Calculating trip travel times using available geo APIs for 5k+ addresses

I'm working on a transportation model and am about to compute a travel-time matrix between 5,000 points. Is there a free, semi-reliable way to calculate the travel times between all my nodes?
I think Google Maps has a limit on the number of queries/hits I can make.
EDIT
I'd like to use an API such as Google Maps or similar, as they include data such as road directions, number of lanes, posted speed, type of road, etc.
EDIT 2
Please be advised that OpenStreetMap data is incomplete and not available for all jurisdictions outside the US.
The Google Directions API restricts you to 2,500 calls per day. Additionally, the terms of service stipulate that you must only use the service "in conjunction with displaying the results on a Google map".
You may be interested in OpenTripPlanner, an in-development project that can do multi-modal routing, and Graphserver, on which OpenTripPlanner is built.
One approach would be to use OpenStreetMap data with Graphserver to generate shortest-path trees from each node.
As that's roughly 12.5 million connections (5,000 × 4,999 / 2 = 12,497,500 pairs), I'm pretty sure you'll hit some sort of limit if you attempt to use Google Maps for all of them. How accurate do the results need to be, and how far are you travelling?
I might try to generate a crude map with travel speeds on it (e.g. mark interstates as fast, yadda yadda), then use some software to calculate how long it would take from point to point. You could visualize it as an electromagnetics problem, where you're trying to calculate the resistance from point to point over a plane with varying resistance (interstates are wires, lakes are open circuits...).
If you really need all of these routes accurately calculated and stored in your database, it sounds like (and I would believe) you are going to have to spend money to obtain this. As you can imagine, this data is expensive to develop, and there should be remuneration.
I would, however, probe a bit about your problem:
Do you really need all ~12.5 million pairwise distances in a database? What if you asked Google for them as you needed them, and then cached them (if allowed)? I've had web applications like this where, because of the slow traffic ramp-up pattern, I was able to leverage free services early on to vet the idea.
Do you really need all 5,000 points? Or could you pick the top 100 and have a more tractable problem?
Perhaps there is some hybrid where you store distances between big cities and estimate the shorter distances.
Again, I really don't know what your problem is, but maybe thinking a bit outside the box will help you find an easier solution.
You might have to go for some heuristics here. Maybe you can estimate travel time based on a few factors, like geometric distance and some features of the start and end points (urban vs. rural areas, country, ...). You could get a few real distances, fit your parameters on a subset of them, and see how well you can predict the others; a sketch of this follows. My prediction would be, for example, that in many cases travel time approaches a linear dependence on distance as distance grows.
I know it's messy, but hey, you're trying to estimate 12.5 million data points (or whatever the amount is :)
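As a minimal sketch of that fitting idea in Python (the sample pairs and the linear model are invented for illustration):

```python
import numpy as np

# (great-circle distance in km, observed travel time in minutes) for a small
# sample of routes you did look up via an API -- invented numbers.
sample = np.array([[5, 11], [20, 26], [80, 70], [250, 185], [600, 420]], dtype=float)

# Fit travel_time ~ a * distance + b on the sample...
a, b = np.polyfit(sample[:, 0], sample[:, 1], deg=1)

# ...then predict the pairs you never queried (and validate on held-out routes).
def predict_minutes(distance_km):
    return a * distance_km + b

print(round(predict_minutes(150)))  # rough estimate for a 150 km trip
```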
You might also be able to incrementally add knowledge from already-retrieved "real" travel times by finding cached points close to the ones you're looking for (a sketch follows the steps):
get the closest points StartApprox and EndApprox to the start and end positions such that you already have a travel time between StartApprox and EndApprox
compute the distances StartError (between start and StartApprox) and EndError (between end and EndApprox)
if StartError + EndError > Distance(StartApprox, EndApprox) * 0.10 (or whatever your threshold is), compute the distance via the API (and store it); otherwise, use the known travel time plus an overhead time based on StartError + EndError
(If you have 100 addresses in NY and 100 in SF, all the values are going to be more or less the same (i.e. the differences between them are probably lower than the uncertainty involved in these estimates), and such an approach would keep you from issuing 10,000 queries where one would do.)
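A minimal sketch of that reuse heuristic in Python; the cache layout, the 10% threshold, and the detour-speed assumption used for the overhead are all illustrative:

```python
import math

def dist_km(a, b):
    """Great-circle (haversine) distance between (lat, lon) points, in km."""
    la1, lo1, la2, lo2 = map(math.radians, (*a, *b))
    h = (math.sin((la2 - la1) / 2) ** 2
         + math.cos(la1) * math.cos(la2) * math.sin((lo2 - lo1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

def travel_minutes(start, end, cache, fetch, threshold=0.10, detour_kmh=50):
    """cache: {(a, b): minutes} for already-fetched trips, points as (lat, lon) tuples.
    fetch(a, b): the expensive API call, returning minutes."""
    # Find the cached pair whose endpoints are closest to ours.
    best = min(cache, key=lambda ab: dist_km(start, ab[0]) + dist_km(end, ab[1]))
    start_err = dist_km(start, best[0])
    end_err = dist_km(end, best[1])
    if start_err + end_err > threshold * dist_km(*best):
        cache[(start, end)] = fetch(start, end)  # too far off: pay for a real lookup
        return cache[(start, end)]
    # Close enough: reuse the known time, plus overhead for the endpoint offsets.
    return cache[best] + (start_err + end_err) / detour_kmh * 60
```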
Many GIS software packages have routing algorithms, if you have the data... Transportation data can be fairly spendy.
There are some other choices of sources for planning routes. Is this something to be done repeatedly, or a one-time process? Can it be broken up into smaller subsets of points? Perhaps you can use multiple routing sources and split the data points into segments small enough for each routing engine.
Here are some other choices from a quick Google search:
Wikipedia
Route66
Truck Miles

Optimal map routing with Google Maps

Is there a way using the Google Maps API to get back an "optimized" route given a set of waypoints (in other words, a "good-enough" solution to the traveling salesman problem), or does it always return the route with the points in the specified order?
There is an option in the Google Maps API DirectionsRequest called optimizeWaypoints, which should do what you want. It can only handle up to 8 waypoints, though. A sketch of the web-service equivalent follows.
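For the HTTP web service, the same behavior is exposed via an optimize:true prefix on the waypoints parameter, with the chosen order reported in waypoint_order. A minimal Python sketch; the locations and API key are placeholders:

```python
import json
import urllib.parse
import urllib.request

def optimized_order(origin, destination, waypoints, api_key):
    """Ask the Directions web service to reorder the intermediate waypoints."""
    params = urllib.parse.urlencode({
        "origin": origin,
        "destination": destination,
        "waypoints": "optimize:true|" + "|".join(waypoints),
        "key": api_key,
    })
    url = "https://maps.googleapis.com/maps/api/directions/json?" + params
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    return data["routes"][0]["waypoint_order"]  # indices into your waypoints list

# order = optimized_order("Chicago,IL", "Chicago,IL",
#                         ["Joliet,IL", "Evanston,IL", "Naperville,IL"], "YOUR_KEY")
```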
Alternatively, there is an open-source (MIT-licensed) library that you can use with the Google Maps API to get an optimal route (up to 15 locations) or a pretty-close-to-optimal route (up to 100 locations).
See http://code.google.com/p/google-maps-tsp-solver/
You can see the library in action at www.optimap.net
It always gives them back in the order you specified.
So I think you'd have to find the distance (or time) between each pair of points, one at a time, then solve the traveling salesman problem yourself. Maybe you could convince Google Maps to add that feature, though. I guess what constitutes a "good enough" solution depends on what you're doing and how fast it needs to be.
Google has a ready solution for the Traveling Salesman Problem: OR-Tools (Google's Operations Research tools), which you can find here: https://developers.google.com/optimization/routing/tsp
What you need to do is basically two things:
Get the distances between each pair of points using the Google Maps Distance Matrix API: https://developers.google.com/maps/documentation/distance-matrix/start
Then feed the distances, as a matrix, to OR-Tools, and it will find a very good solution for you (for certain instances with millions of nodes, solutions have been found that are guaranteed to be within 1% of an optimal tour).
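A minimal sketch of that second step using OR-Tools' Python routing API; the 4-point distance matrix is invented for illustration:

```python
from ortools.constraint_solver import pywrapcp, routing_enums_pb2

# Invented symmetric distance matrix (e.g. meters between 4 points).
DIST = [
    [0, 2451, 713, 1018],
    [2451, 0, 1745, 1524],
    [713, 1745, 0, 355],
    [1018, 1524, 355, 0],
]

manager = pywrapcp.RoutingIndexManager(len(DIST), 1, 0)  # 4 nodes, 1 vehicle, depot 0
routing = pywrapcp.RoutingModel(manager)

def distance_cb(from_index, to_index):
    return DIST[manager.IndexToNode(from_index)][manager.IndexToNode(to_index)]

transit = routing.RegisterTransitCallback(distance_cb)
routing.SetArcCostEvaluatorOfAllVehicles(transit)

params = pywrapcp.DefaultRoutingSearchParameters()
params.first_solution_strategy = routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC
solution = routing.SolveWithParameters(params)

index, tour = routing.Start(0), []
while not routing.IsEnd(index):
    tour.append(manager.IndexToNode(index))
    index = solution.Value(routing.NextVar(index))
print(tour)  # visiting order, starting at the depot
```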
You can also note that:
In addition to finding solutions to the classical Traveling Salesman Problem, OR-Tools also provides methods for more general types of TSPs, including the following:
Asymmetric cost problems: the traditional TSP is symmetric (the distance from point A to point B equals the distance from point B to point A), but the cost of shipping items from point A to point B might not equal the cost of shipping them from point B to point A. OR-Tools can also handle problems that have asymmetric costs.
Prize-collecting TSPs, where benefits accrue from visiting nodes
TSPs with time windows
Additional links:
OR-tools at Github: https://github.com/google/or-tools
Get Started: https://developers.google.com/optimization/introduction/get_started
In a typical TSP, the assumption is that one can travel directly between any two points. For surface roads, this is never the case. When Google calculates a route between two points, it does a heuristic spanning-tree optimization and usually comes up with a fairly close-to-optimal path.
To calculate a TSP route, one would first have to ask Google for the pairwise distance between every node in the graph. I think this requires n*(n-1)/2 calculations. One could then take those distances and perform a TSP optimization on them.
OpenStreetMap.org has a Java Web Start application which may do what you want. Of course, the calculations are run client-side. The project is open source and may be worth a look.
Are you trying to find an optimal straight-line path between locations, or the optimal driving route? If you just want to order the points and you can get their GPS coordinates, it becomes a very easy problem.
Just found http://gebweb.net/optimap/ - it looks nice and easy: an online version using Google Maps.