I'm working on a transportation model, and am about to build a travel-time matrix between 5,000 points. Is there a free, semi-reliable way to calculate the travel times between all my nodes?
I think Google Maps has a limit on the number of queries I can make.
EDIT
I'd like to use an API such as Google Maps or similar ones, as they include data such as road directions, number of lanes, posted speeds, road types, etc.
EDIT 2
Please be advised that OpenStreetMap data is incomplete and not available for all jurisdictions outside the US.
Google Directions API restricts you to 2500 calls per day. Additionally, terms of service stipulate that you must only use the service "in conjunction with displaying the results on a Google map".
You may be interested in OpenTripPlanner, an in-development project which can do multi-modal routing, and Graphserver on which OpenTripPlanner is built.
One approach would be to use OpenStreetMap data with Graphserver to generate Shortest Path Trees from each node.
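If Graphserver proves awkward to set up, here is a rough sketch of the same idea (one shortest-path tree per origin) using the newer osmnx and networkx Python libraries as a stand-in for Graphserver. The place name is a placeholder, and the speed/travel-time helpers may live under ox.routing in recent osmnx versions:

    import osmnx as ox
    import networkx as nx

    # Download a drivable street network for the study area (placeholder place name).
    G = ox.graph_from_place("Portland, Oregon, USA", network_type="drive")

    # Impute speeds from OSM tags, then derive per-edge travel times in seconds.
    G = ox.add_edge_speeds(G)
    G = ox.add_edge_travel_times(G)

    # One Dijkstra run per origin gives travel times to every reachable node,
    # so 5,000 runs cover the whole 5,000 x 5,000 matrix.
    origin = list(G.nodes)[0]  # stand-in for one of your snapped points
    times = nx.single_source_dijkstra_path_length(G, origin, weight="travel_time")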
As that's 12,497,500 point pairs (5,000 × 4,999 / 2), I'm pretty sure you'll hit some sort of limit if you attempt to use Google Maps for all of them. How accurate do the results need to be, and how far are you travelling?
I might try to generate a crude map with travel speeds on it (e.g. mark off interstates as fast, yadda yadda) then use some software to calculate how long it would take from point to point. One could visualize it as an electromagnetic fields problem, where you're trying to calculate the resistance from point to point over a plane with varying resistance (interstates are wires, lakes are open circuits...).
If you really need all these routes accurately calculated and stored in your database, it sounds like (and I would believe) you are going to have to spend the money to obtain this. As you can imagine, this is expensive to develop, and there should be remuneration.
I would, however, probe a bit about your problem:
Do you really need all ~12.5 million pairwise distances in a database? What if you asked Google for them as you needed them, and then cached them (if allowed)? I've had web applications like this where, because of the slow traffic ramp-up pattern, I was able to leverage free services early on to vet the idea.
Do you really need all 5000 points? Or could you pick the top 100 and have a more tractable problem?
Perhaps there is some hybrid where you store distances between big cities and do more estimates for shorter distances.
Again, I really don't know what your problem is, but maybe thinking a bit outside the box will help you find an easier solution.
You might have to resort to heuristics here. Maybe you can estimate travel time based on a few factors, like geometric distance and some features of the start and end points (urban vs. rural area, country, ...). You could get a few real travel times, fit your parameters on a subset of them, and see how well you predict the rest. My prediction would be, for example, that in many cases travel time approaches a linear dependence on distance as distance grows larger.
I know it's messy, but hey, you're trying to estimate some 12.5 million data points (or whatever the exact amount is :)
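To illustrate the fitting idea, here is a least-squares sketch in Python; the sample data and the urban/rural endpoint features are entirely made up:

    import numpy as np

    # X columns: [distance_km, start_is_urban, end_is_urban]; y: observed minutes.
    X = np.array([[12.0, 1, 1], [85.0, 1, 0], [240.0, 0, 0], [40.0, 0, 1]])
    y = np.array([22.0, 70.0, 150.0, 45.0])

    # Ordinary least squares with an intercept column appended.
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)

    def predict_minutes(distance_km, start_urban, end_urban):
        return np.array([distance_km, start_urban, end_urban, 1.0]) @ coef

    print(predict_minutes(100.0, 1, 0))  # estimate for an unseen pair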
You might also be able to incrementally add knowledge from already-retrieved "real" travel times by finding points close to the ones you're looking for (a rough sketch follows the steps below):
get the closest points StartApprox and EndApprox to your start and end positions such that you already have a travel time between StartApprox and EndApprox
compute the distances StartError and EndError between start and StartApprox, and between end and EndApprox
if StartError + EndError > Distance(StartApprox, EndApprox) * 0.10 (or whatever your threshold is), compute the travel time via the API (and store it); otherwise, use the known travel time plus an overhead time based on StartError + EndError
(If you have 100 addresses in NY and 100 in SF, all the values are going to be more or less the same (i.e. the difference between them is probably lower than the uncertainty involved in these predictions), and such an approach would keep you from issuing 10,000 queries where one would do.)
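A sketch of that reuse heuristic in Python. Everything here is illustrative: cache maps (start, end) pairs to known travel times in minutes, haversine returns kilometres, api_travel_time stands in for a real API call, and the ~50 km/h overhead speed is an arbitrary assumption:

    def travel_time(start, end, cache, haversine, api_travel_time, threshold=0.10):
        # Find the cached pair whose endpoints best approximate ours.
        best = min(
            cache,
            key=lambda pair: haversine(start, pair[0]) + haversine(end, pair[1]),
            default=None,
        )
        if best is not None:
            start_err = haversine(start, best[0])
            end_err = haversine(end, best[1])
            if start_err + end_err <= threshold * haversine(best[0], best[1]):
                # Close enough: reuse the known time plus a crude overhead,
                # assuming ~50 km/h over the extra distance.
                return cache[best] + 60 * (start_err + end_err) / 50.0
        t = api_travel_time(start, end)  # otherwise pay for a real query
        cache[(start, end)] = t
        return t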
Many GIS software packages have routing algorithms, if you have the data... Transportation data can be fairly spendy.
There are some other choices of sources for planning routes. Is this something to be done repeatedly, or a one-time process? Can this be broken up into smaller sub-sets of points? Perhaps you can use multiple routing sources and break up the data points into segments small enough for each routing engine.
Here are some other choices from a quick Google search:
Wikipedia
Route66
Truck Miles
Related
I want a list of locations (coordinates) for all possible colonies/neighborhoods of some Indian cities. Take for example Delhi. Can this data be obtained with the Places API?
The only thing that comes to my mind is to use a query like this:
https://maps.googleapis.com/maps/api/place/search/xml?location=28.540346,77.210026&radius=500&types=administrative_area_level_1|administrative_area_level_2|administrative_area_level_3|locality|neighborhood|street_address|sublocality|sublocality_level_4|sublocality_level_5|sublocality_level_3|sublocality_level_2|sublocality_level_1|subpremise&sensor=false&key=MYKEY
and then keep increasing the radius by 500 until the whole city is covered.
Is there a better way of doing this?
Given how often you would need to do this for your map, and since caching that data goes against the terms of service, this is not a great approach. If your map gets any decent usage, you'll rapidly hit your quota. Plus, you only get the center points of the colonies/neighborhoods. I'd recommend trying to find another source for that data that you can download. The Places API was not designed with this in mind.
My client wants some of the functionality of Google maps namely:
- geocoding
- generating maps with points based on postal code or long.lat
- optimal trip mapping
Their issues with Google Maps:
- cannot control outages
- postal codes are sometimes inaccurate or not updated frequently for Canada/UK
- they have no way to correct inaccurate information
They would prefer to host the mapping application themselves, but will require postal code updates.
Can anyone suggest such a product?
thanks
"cannot control outages - postal codes are sometimes inaccurate or not updated frequently for Canada/UK - they have no way to correct inaccurate information"
Outages
Hosting your own mapping is the only way to control this, but you would be very hard pressed to beat Google Maps' / Bing Maps' uptime over the last 5 years. Take a look at the following:
OpenStreetMap for the road data. This is open-source data, very good in the UK (I'm not sure about Canada), and you can make your own changes and submit them (or just change the data you have downloaded).
GeoServer, Mapnik or MapServer will read OpenStreetMap data and create the image tiles needed to serve your own maps in whatever style you wish. If you don't need all countries and all zoom levels, these products can create all the tiles you will need in advance; usually, though, tiles have to be created in real time and cached. You need a BIG, fast server to manage the tile crunching.
OpenLayers or Leaflet are open-source JavaScript mapping libraries that will display your tiles for you.
Obviously this is just for road maps, aerial imagery would cost you an absolute fortune.
Post Code Data
Many people do not realize that UK postcode data for latitude and longitude is now completely free and available to download every quarter from the official source, Ordnance Survey: http://www.ordnancesurvey.co.uk/oswebsite/products/code-point-open/index.html.
This is the same data source Google uses, and there is none better, but it will always contain inaccuracies and always be a few months out of date.
Finally
Hopefully that answers the question you asked and gives you information to share with your client. Now for the question you didn't ask: "Is this approach good value for my client?".
I won't presume to know your business or client; however, what I described above is possible, but with one to many months of work involved to get it all working together. Even then, it won't have anywhere near the performance or uptime of something like Google/Bing Maps, and it only offers a small subset of their features.
I think you're looking for something like Caliper. It's a very custom, and I would expect expensive, solution. Not recommended.
http://www.caliper.com/GISMappingSoftwareDevelopment.htm
One solution could be to use two different mapping services and compare their results; this way there's a much better chance the data is accurate. You can also fix inaccurate data by creating a system that acts as a barrier between the API and your users, where data you know is inaccurate is corrected before it's displayed. I'm not sure exactly what you're doing, though, so this might not work for you.
Is trip mapping/routing the basic functionality you want to do?
Before rushing into rolling your own, I'd suggest a good think about the consequences of doing so. The first that springs to mind is that whilst the pros are that you can now control your data, the cons are that you now control your data.
So you are going to have to consider where and when you get updates, and the processes you will have to employ to keep your maps in sync with the rest of the world. There are a lot of headaches involved in these things, which is why so many people use externally hosted solutions such as Google's.
Is there a way using the Google Maps API to get back an "optimized" route given a set of waypoints (in other words, a "good-enough" solution to the traveling salesman problem), or does it always return the route with the points in the specified order?
There is an option in the Google Maps API's DirectionsRequest called optimizeWaypoints, which should do what you want. It can only handle up to 8 waypoints, though.
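If you're calling the web service from a server rather than from JavaScript, the same option is exposed by the googlemaps Python client as optimize_waypoints (a sketch; the key and addresses are placeholders):

    import googlemaps

    gmaps = googlemaps.Client(key="YOUR_API_KEY")  # placeholder key

    result = gmaps.directions(
        origin="Boston, MA",
        destination="Boston, MA",
        waypoints=["Providence, RI", "Worcester, MA", "Hartford, CT"],
        optimize_waypoints=True,  # let Google reorder the stops
        mode="driving",
    )

    # The stop ordering the API chose:
    print(result[0]["waypoint_order"])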
Alternatively, there is an open source (MIT license) library that you can use with the Google Maps API to get an optimal (up to 15 locations) or pretty close to optimal (up to 100 locations) route.
See http://code.google.com/p/google-maps-tsp-solver/
You can see the library in action at www.optimap.net
It always gives them in order.
So I think you'd have to find the distance (or time) between each pair of points, one at a time, then solve the traveling salesman problem yourself. Maybe you could convince Google Maps to add that feature though. I guess what constitutes a "good enough" solution depends on what you're doing and how fast it needs to be.
Google has a ready-made solution for the Travelling Salesman Problem: OR-Tools (Google's Operations Research tools), which you can find here: https://developers.google.com/optimization/routing/tsp
Basically, you need to do two things (a sketch follows these steps):
Get the distances between each pair of points using the Google Maps Distance Matrix API: https://developers.google.com/maps/documentation/distance-matrix/start
Then feed the distances, as an array, to OR-Tools and it will find a very good solution for you (for certain instances with millions of nodes, solutions have been found that are guaranteed to be within 1% of an optimal tour).
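Here is a minimal sketch of that pipeline using OR-Tools' Python routing solver, with a toy distance matrix standing in for real Distance Matrix API results:

    from ortools.constraint_solver import pywrapcp, routing_enums_pb2

    # Toy symmetric distances; in practice these come from the Distance Matrix API.
    dist = [
        [0, 10, 15, 20],
        [10, 0, 35, 25],
        [15, 35, 0, 30],
        [20, 25, 30, 0],
    ]

    manager = pywrapcp.RoutingIndexManager(len(dist), 1, 0)  # 1 vehicle, depot = node 0
    routing = pywrapcp.RoutingModel(manager)

    def distance_callback(i, j):
        return dist[manager.IndexToNode(i)][manager.IndexToNode(j)]

    transit = routing.RegisterTransitCallback(distance_callback)
    routing.SetArcCostEvaluatorOfAllVehicles(transit)

    params = pywrapcp.DefaultRoutingSearchParameters()
    params.first_solution_strategy = (
        routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC
    )

    solution = routing.SolveWithParameters(params)
    index, tour = routing.Start(0), []
    while not routing.IsEnd(index):
        tour.append(manager.IndexToNode(index))
        index = solution.Value(routing.NextVar(index))
    print(tour)  # visiting order, e.g. [0, 1, 3, 2]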
You can also note that:
In addition to finding solutions to the classical Traveling Salesman Problem, OR-Tools also provides methods for more general types of TSPs, including the following:
Asymmetric cost problems: the traditional TSP is symmetric, so the distance from point A to point B equals the distance from point B to point A. However, the cost of shipping items from point A to point B might not equal the cost of shipping them from point B to point A. OR-Tools can also handle problems that have asymmetric costs.
Prize-collecting TSPs, where benefits accrue from visiting nodes
TSP with time windows
Additional links:
OR-Tools on GitHub: https://github.com/google/or-tools
Get Started: https://developers.google.com/optimization/introduction/get_started
In a typical TSP problem, the assumption is that one can travel directly between any two points. For surface roads, this is never the case. When Google calculates a route between two points, it does a heuristic spanning-tree optimization and usually comes up with a fairly close-to-optimal path.
To calculate a TSP route, one would first have to ask Google to calculate the pair-wise distance between every node in the graph. I think this requires n*(n-1)/2 calculations. One could then take those distances and perform a TSP optimization on them.
OpenStreetMap.org has a Java WebStart application which may do what you want. Of course, the calculations are run client-side. The project is open source and may be worth a look.
Are you trying to find an optimal straight line path between locations, or the optimal driving route? If you just want to order the points, if you can get the GPS coordinates, it becomes a very easy problem.
Just found http://gebweb.net/optimap/. It looks nice and easy: an online version using Google Maps.
If I had lat/long data for all our leads in Salesforce, is there a way to write a query to group them, or, say, list all the leads within 10 miles of San Francisco, CA?
[EDIT: Clarification]
I have thousands of leads with both a full address, and long/lats.
I want to build a query on these leads that will give me all of the leads near San Francisco, CA. This means doing GIS-type work within Salesforce.
I could of course filter specifically on city, zip code or area code, but this presents some problems when trying to roll up a whole metro area.
Yes. You need to reverse-geocode them with a tool/service. In the past I used Maporama's service, but it was quite expensive, and that was before Google Maps and Virtual Earth existed, so I am sure there is something cheaper (or free) out there now... Googling around, I have found this and this.
EDIT:
OK, from what I understand, you are trying to calculate the distance between two lat/long points. I would start by discounting the ones that are outside your radius of (let's say) 10 miles: from your central point, get the coordinates 10 miles east, west, south and north. To do this you need the great-circle distance formula.
From that point you have your Salesforce data. If you wish to break this data up further, you then need to order the points by distance from the central point. To do this you need the Haversine formula.
I am not sure what your language preference is, so I've just included some examples in SQL (mainly) and C#; a Python version follows the links below.
Haversine Formula in C# and in SQL
Determine the distance between ZIP codes using C#
Great Circle SQL
Great Circle 2
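Since the links above are SQL and C#, here is a compact Python version of the haversine formula for reference (3,959 miles is an approximate Earth radius):

    from math import radians, sin, cos, asin, sqrt

    def haversine_miles(lat1, lon1, lat2, lon2):
        lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
        a = sin((lat2 - lat1) / 2) ** 2 \
            + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
        return 2 * 3959 * asin(sqrt(a))

    # e.g. San Francisco city center to a lead near San Jose: roughly 42 miles.
    print(haversine_miles(37.7749, -122.4194, 37.3382, -121.8863))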
Use GeoHash.org (either as a web service or by implementing the algorithm). It hashes your lat/long coordinates into a form that is similar for nearby places. For example, A may have a hash like "akusDf3af" and B might have a hash like "akusDf3b2" if they are nearby. Then do a SOQL query that looks for places starting with the same n characters as a known location. Your n will determine the radius of the lookup.
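A sketch of the prefix idea in Python using the pygeohash package (my choice; any geohash implementation behaves the same way):

    import pygeohash as pgh

    office = pgh.encode(37.7749, -122.4194, precision=9)  # San Francisco
    lead = pgh.encode(37.7849, -122.4094, precision=9)    # about 1.4 km away

    # Nearby points usually share a prefix; a shorter prefix means a wider radius.
    # Caution: cells have hard edges, so two close points can straddle a boundary;
    # in practice you would also query the neighbouring cells.
    n = 5
    print(office[:n], lead[:n], office[:n] == lead[:n])

    # The SOQL side would then filter on something like:
    #   WHERE GeoHash__c LIKE 'prefix%'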
These are some great technical solutions that can provide very exact answers, but two things to consider:
geospatial proximity does not map neatly to responsibility
Ownership calculation seems to be done best through postal-code lookups or other rules that don't allow for gaps or overlaps. Otherwise, you'll have two (or more) salespeople fighting over leads that are close to both of them, while ignoring the leads that are far away from both of them.
So, if you're using geo-calculations like those above to assign ownership, just acknowledge that the system will leak, and create business rules to accommodate that. A simple postal lookup to define territories (as Salesforce's own territory management feature does) might be better.
I'd suggest the problem we're trying to solve geospatially is not who owns which lead, but rather: given all the leads you own, which are nearby?
maps often offer more data per pixel than columnar reports
Again, geospatial data in a report may not be the best answer. A lead 50 km away but along a major road is more interesting than another lead 50 km away on the other side of a mountain or lake. And a lead close to other leads is more interesting than a lead by itself.
A report can't show this, but a map can.
Salesforce has some great examples of Google Maps integrations. Instead of a columnar report called "My Nearby Leads", why not a Visualforce page with a Google map inside? You're giving the user far more information than a columnar report could. They might like it better, and it's easier to implement than trying to calculate some of the equations above.
Just another perspective that may (or may not) be appropriate to the problem at hand.
This post is really old, but it is showing up at the top of Google results, so I figured I would add some info to it anyway.
Two nice mapping tools are batchgeo.com and geocod.io. Geocod.io can even give you lat and long coordinates from an address.
If you just need a one-time calculation, you can use Excel. Export all your leads with their lat and long. Then go to Google Maps and get the lat and long, in decimal degrees, for the city center of wherever you want to measure to.
Then use this formula in Excel to calculate the distance between the coordinates, in miles. lat1dd and long1dd are the coordinates for one point, and lat2dd and long2dd are the coordinates for the other point.
=3963*ACOS(COS(RADIANS(90-lat1dd))*COS(RADIANS(90-lat2dd))+SIN(RADIANS(90-lat1dd))*SIN(RADIANS(90-lat2dd))*COS(RADIANS(long1dd-long2dd)))
After you run it, just sort the results from smallest to largest to see which leads are closest.
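For sanity-checking outside the spreadsheet, the same spherical-law-of-cosines formula translates directly to Python:

    from math import radians, sin, cos, acos

    def miles_between(lat1dd, long1dd, lat2dd, long2dd):
        # Same colatitude form as the Excel formula; 3963 miles = Earth radius.
        return 3963 * acos(
            cos(radians(90 - lat1dd)) * cos(radians(90 - lat2dd))
            + sin(radians(90 - lat1dd)) * sin(radians(90 - lat2dd))
            * cos(radians(long1dd - long2dd))
        )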
I haven't done this next part yet, but conceptually it should work. We have a field that lists the major market each account is in, for example "Chicago IL". I am going to build a trigger or formula field that essentially says IF(Market="Chicago IL"), then use X and Y for the lat and long, hardcoded as the city center for that specific market. The query then runs each individual account's lat and long against the city-center values to calculate a distance.
If you wanted to break the market into different zones, you could adjust your formula so it uses < and > on the lat and long fields: everything less than X but greater than Y goes in Zone A, etc.
Hope this helps someone.
I've always been intrigued by Map Routing, but I've never found any good introductory (or even advanced!) level tutorials on it. Does anybody have any pointers, hints, etc?
Update: I'm primarily looking for pointers as to how a map system is implemented (data structures, algorithms, etc).
Take a look at the OpenStreetMap project to see how this sort of thing is being tackled in a truly free software project using only user-supplied and user-licensed data. It has a wiki containing material you might find interesting.
A few years back the people involved were pretty easy-going and answered lots of questions I had, so I see no reason why they wouldn't still be a nice bunch.
A* is actually far closer to production mapping algorithms. It requires quite a bit less exploration than Dijkstra's original algorithm.
By Map Routing, you mean finding the shortest path along a street network?
Dijkstra's shortest-path algorithm is the best known. Wikipedia has a decent intro: http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm
There's a Java applet here where you can see it in action: http://www.dgp.toronto.edu/people/JamesStewart/270/9798s/Laffra/DijkstraApplet.html and Google will lead you to source code in just about any language.
Any real implementation for generating driving routes will include quite a bit of data about the street network that describes the costs associated with traversing links and nodes: road network hierarchy, average speed, intersection priority, traffic signal linking, banned turns, etc.
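To give a feel for the core of it, here is a bare-bones Dijkstra sketch in Python over a toy graph, with edge weights standing in for link traversal costs (e.g. minutes):

    import heapq

    graph = {
        "A": {"B": 4, "C": 2},
        "B": {"D": 5},
        "C": {"B": 1, "D": 8},
        "D": {},
    }

    def dijkstra(graph, source):
        dist = {source: 0}
        heap = [(0, source)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue  # stale queue entry
            for v, w in graph[u].items():
                nd = d + w
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        return dist

    print(dijkstra(graph, "A"))  # {'A': 0, 'B': 3, 'C': 2, 'D': 8}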
Barry Brumitt, one of the engineers of Google maps route finding feature, wrote a post on the topic that may be of interest:
The road to better path-finding
Instead of learning the API of each map service provider (like the Gmaps or Ymaps APIs), it's good to learn Mapstraction:
"Mapstraction is a library that provides a common API for various javascript mapping APIs"
I would suggest you go to the URL and learn this general API. There is a good amount of how-tos, too.
I've yet to find a good tutorial on routing, but there is lots of code to read:
There are GPL routing applications that use OpenStreetMap data, e.g. Gosmore, which works on Windows (+ mobile) and Linux. There are a number of interesting applications using the same data, but Gosmore has some cool uses, e.g. interfacing with websites.
The biggest problem with routing is bad data, and you never get good enough data. So if you want to try it, keep your tests very local so you can control the data better.
From a conceptual point of view, imagine dropping a stone into a pond and watching the ripples. The routes would represent the pond and the stone your starting position.
Of course the algorithm would have to search some proportion of the n^2 paths as the distance n increases. You would take your starting position and check all available paths from that point, then recursively call for the points at the end of those paths, and so on.
You can increase performance by not double-backing on a path, by not re-checking the routes at a point if it has already been covered, and by giving up on paths that are taking too long.
An alternative way is to use the ant pheromone approach, where ants crawl randomly from a start point and leave a scent trail, which builds up the more ants cross over a given path. If you send (enough) ants from both the start point and the end points then eventually the path with the strongest scent will be the shortest. This is because the shortest path will have been visited more times in a given time period, given that the ants walk at a uniform pace.
EDIT (@Spikie)
As a further explanation of how the pond algorithm could be implemented, with the potential data structures needed highlighted:
You'll need to store the map as a network. This is simply a set of nodes and the edges between them. A sequence of nodes constitutes a route. An edge joins two nodes (possibly both the same node) and has an associated cost, such as the distance or time to traverse it. An edge can be either bi-directional or uni-directional; it's probably simplest to just have uni-directional ones and double up for two-way travel between nodes (i.e. one edge from A to B and a different one for B to A).
By way of example, imagine three railway stations arranged in an equilateral triangle pointing upwards, plus a further three stations, each halfway between two of them. Edges join all adjacent stations together; the final diagram will have an inverted triangle sitting inside the larger triangle.
Label the nodes starting from the bottom left, going left to right and up, as A, B, C, D, E, F (F at the top).
Assume the edges can be traversed in either direction. Each edge has a cost of 1 km.
OK, so we wish to route from the bottom-left station A to the top station F. There are many possible routes, including ones that double back on themselves, e.g. ABCEBDEF.
We have a routine, say NextNode, that accepts a node and a cost, and calls itself for each node it can travel to.
Clearly, if we let this routine run it will eventually discover all routes, including ones that are potentially infinite in length (e.g. ABABABAB, etc.). We stop this from happening by checking against the cost. Whenever we visit a node that hasn't been visited before, we record both the cost and the node we came from against that node. If a node has been visited before, we check against the existing cost, and if we're cheaper then we update the node and carry on (recursing). If we're more expensive, then we skip the node. If all nodes are skipped, then we exit the routine.
If we hit our target node then we exit the routine too.
This way all viable routes are checked, but crucially only those with the lowest cost. By the end of the process, each node will have the lowest cost for getting to that node, including our target node.
To get the route we work backwards from our target node. Since we stored the node we came from along with the cost, we just hop backwards building up the route. For our example we would end up with something like:
Node A - (Total) Cost 0 - From Node None
Node B - Cost 1 - From Node A
Node C - Cost 2 - From Node B
Node D - Cost 1 - From Node A
Node E - Cost 2 - From Node D / Cost 2 - From Node B (this is an exception as there is equal cost)
Node F - Cost 2 - From Node D
So the shortest route is ADF.
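As a quick check of the worked example, here is the same network in Python with the networkx library; the edge list is my reading of the station layout described above:

    import networkx as nx

    G = nx.Graph()
    G.add_edges_from([
        ("A", "B"), ("B", "C"),              # bottom row
        ("A", "D"), ("C", "E"),              # lower sides
        ("D", "F"), ("E", "F"),              # upper sides
        ("B", "D"), ("B", "E"), ("D", "E"),  # the inverted inner triangle
    ])

    print(nx.shortest_path(G, "A", "F"))         # ['A', 'D', 'F']
    print(nx.shortest_path_length(G, "A", "F"))  # 2 (km, at 1 km per edge)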
From my experience of working in this field, A* does the job very well. It is (as mentioned above) faster than Dijkstra's algorithm, but is still simple enough for an ordinarily competent programmer to implement and understand.
Building the route network is the hardest part, but that can be broken down into a series of simple steps: get all the roads; sort the points into order; turn groups of identical points on different roads into intersections (nodes); add arcs in both directions where nodes connect (or in one direction only for a one-way road).
The A* algorithm itself is well documented on Wikipedia. The key place to optimise is the selection of the best node from the open list, for which you need a high-performance priority queue. If you're using C++ you can use the STL priority_queue adapter.
Customising the algorithm to route over different parts of the network (e.g. pedestrian, car, public transport, etc.), or to favour speed, distance or other criteria, is quite easy. You do that by writing filters to control which route segments are available when building the network, and which weight is assigned to each one.
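Here is a sketch of those hooks in Python: a per-mode filter on segments plus a straight-line heuristic. The names and data layout are illustrative, and the heuristic must never overestimate the real cost or A* loses its optimality guarantee:

    import heapq
    import math

    def a_star(nodes, edges, start, goal, mode="car"):
        # nodes: {id: (x, y)}; edges: {id: [(neighbour, cost, allowed_modes)]}.
        def h(n):  # straight-line heuristic (units assumed to match edge costs)
            (x1, y1), (x2, y2) = nodes[n], nodes[goal]
            return math.hypot(x2 - x1, y2 - y1)

        open_list = [(h(start), 0, start, [start])]
        best = {start: 0}
        while open_list:
            _, g, n, path = heapq.heappop(open_list)
            if n == goal:
                return path, g
            for nbr, cost, allowed in edges.get(n, []):
                if mode not in allowed:
                    continue  # filter: segment not usable by this travel mode
                ng = g + cost
                if ng < best.get(nbr, math.inf):
                    best[nbr] = ng
                    heapq.heappush(open_list, (ng + h(nbr), ng, nbr, path + [nbr]))
        return None, math.inf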
Another thought occurs to me regarding the cost of each traversal, though it would increase the time and processing power required to compute routes.
Example: there are 3 routes I can take (where I live) to go from point A to B, according to Google Maps. Garmin units offer each of these 3 paths in their Quickest route calculation. After traversing each of these routes many times and averaging (obviously there will be errors depending on the time of day, amount of caffeine, etc.), I feel the algorithms could take the number of bends in the road into account for a higher level of accuracy, e.g. a straight road of 1 mile will be quicker than a 1-mile road with sharp bends in it.
Not a practical suggestion but certainly one I use to improve the result set of my daily commute.