I've always been intrigued by Map Routing, but I've never found any good introductory (or even advanced!) level tutorials on it. Does anybody have any pointers, hints, etc?
Update: I'm primarily looking for pointers as to how a map system is implemented (data structures, algorithms, etc).
Take a look at the OpenStreetMap project to see how this sort of thing is being tackled in a truly free software project that uses only user-supplied and licensed data. It also has a wiki containing material you might find interesting.
A few years back the people involved were pretty easygoing and answered lots of questions I had, so I see no reason why they wouldn't still be a nice bunch.
A* is actually far closer to production mapping algorithms. It requires quite a bit less exploration than Dijkstra's original algorithm.
By Map Routing, you mean finding the shortest path along a street network?
Dijkstra's shortest-path algorithm is the best known. Wikipedia has a decent intro: http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm
There's a Java applet here where you can see it in action: http://www.dgp.toronto.edu/people/JamesStewart/270/9798s/Laffra/DijkstraApplet.html and Google will lead you to source code in just about any language.
Any real implementation for generating driving routes will include quite a bit of data on the street network that describes the costs associated with traversing links and nodes: road network hierarchy, average speed, intersection priority, traffic signal linking, banned turns, etc.
Barry Brumitt, one of the engineers of Google maps route finding feature, wrote a post on the topic that may be of interest:
The road to better path-finding
11/06/2007 03:47:00 PM
Instead of learning the API of each map service provider (such as Gmaps or Ymaps), it's good to learn Mapstraction.
"Mapstraction is a library that provides a common API for various javascript mapping APIs"
I would suggest you go to the URL and learn a general API. There is good amount of How-Tos too.
I've yet to find a good tutorial on routing, but there is lots of code to read:
There are GPL routing applications that use OpenStreetMap data, e.g. Gosmore, which works on Windows (+ mobile) and Linux. There are a number of interesting applications using the same data, but Gosmore has some cool uses, e.g. interfacing with websites.
The biggest problem with routing is bad data, and you never get good enough data. So if you want to try it keep your test very local so you can control the data better.
From a conceptual point of view, imagine dropping a stone into a pond and watching the ripples. The routes would represent the pond and the stone your starting position.
Of course the algorithm would have to search some proportion of n^2 paths as the distance n increases. You would take your starting position and check all available paths from that point, then recursively do the same for the points at the end of those paths, and so on.
You can increase performance by not doubling back on a path, by not re-checking the routes at a point that has already been covered, and by giving up on paths that are taking too long.
An alternative way is to use the ant pheromone approach, where ants crawl randomly from a start point and leave a scent trail, which builds up the more ants cross over a given path. If you send (enough) ants from both the start point and the end points then eventually the path with the strongest scent will be the shortest. This is because the shortest path will have been visited more times in a given time period, given that the ants walk at a uniform pace.
EDIT # Spikie
As a further explanation of how to implement the pond algorithm - potential data structures needed are highlighted:
You'll need to store the map as a network. This is simply a set of nodes and edges between them. A sequence of connected nodes constitutes a route. An edge joins two nodes (possibly both the same node), and has an associated cost such as distance or time to traverse the edge. An edge can be either bi-directional or uni-directional. It's probably simplest to just have uni-directional ones and double up for two-way travel between nodes (i.e. one edge from A to B and a different one from B to A).
By way of example imagine three railway stations arranged in an equilateral triangle pointing upwards. There are also a further three stations each halfway between them. Edges join all adjacent stations together, the final diagram will have an inverted triangle sitting inside the larger triangle.
Label nodes starting from bottom left, going left to right and up, as A,B,C,D,E,F (F at the top).
Assume the edges can be traversed in either direction. Each edge has a cost of 1 km.
Ok, so we wish to route from the bottom left A to the top station F. There are many possible routes, including those that double back on themselves, e.g. ABCEBDEF.
We have a routine say, NextNode, that accepts a node and a cost and calls itself for each node it can travel to.
Clearly if we let this routine run it will eventually discover all routes, including ones that are potentially infinite in length (eg ABABABAB etc). We stop this from happening by checking against the cost. Whenever we visit a node that hasn't been visited before, we put both the cost and the node we came from against that node. If a node has been visited before we check against the existing cost and if we're cheaper then we update the node and carry on (recursing). If we're more expensive, then we skip the node. If all nodes are skipped then we exit the routine.
If we hit our target node then we exit the routine too.
This way all viable routes are checked, but crucially only those with the lowest cost. By the end of the process each node will have the lowest cost for getting to that node, including our target node.
To get the route we work backwards from our target node. Since we stored the node we came from along with the cost, we just hop backwards building up the route. For our example we would end up with something like:
Node A - (Total) Cost 0 - From Node None
Node B - Cost 1 - From Node A
Node C - Cost 2 - From Node B
Node D - Cost 1 - From Node A
Node E - Cost 2 - From Node D / Cost 2 - From Node B (this is an exception as there is equal cost)
Node F - Cost 2 - From Node D
So the shortest route is ADF.
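The walkthrough above can be sketched in a few lines of Python. This is only an illustration of the recursive idea, not a tuned implementation: the edge list is my reading of the triangle description, and the names `EDGES`, `GRAPH`, `next_node` and `route` are made up for the example.

```python
# Assumed encoding of the six-station triangle; every edge costs 1 km.
EDGES = ["AB", "BC", "AD", "BD", "BE", "CE", "DE", "DF", "EF"]
GRAPH = {}
for a, b in EDGES:
    # One uni-directional edge each way, as suggested in the text.
    GRAPH.setdefault(a, []).append((b, 1))
    GRAPH.setdefault(b, []).append((a, 1))

def next_node(node, cost, came_from, target, best):
    """Store (cost, came_from) against node if cheaper, then recurse outwards."""
    if node in best and best[node][0] <= cost:
        return  # already reached at least as cheaply: skip this node
    best[node] = (cost, came_from)
    if node == target:
        return  # hit the target: no need to travel beyond it
    for neighbour, edge_cost in GRAPH[node]:
        next_node(neighbour, cost + edge_cost, node, target, best)

def route(start, target):
    best = {}
    next_node(start, 0, None, target, best)
    # Hop backwards from the target using the stored "came from" nodes.
    path, node = [], target
    while node is not None:
        path.append(node)
        node = best[node][1]
    return "".join(reversed(path)), best[target][0]

print(route("A", "F"))
```

Because the recursion goes depth-first it revisits nodes whenever it finds a cheaper cost, which is fine on a toy network but is the reason real implementations process nodes in order of cost with a priority queue.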
From my experience of working in this field, A* does the job very well. It is (as mentioned above) faster than Dijkstra's algorithm, but is still simple enough for an ordinarily competent programmer to implement and understand.
Building the route network is the hardest part, but that can be broken down into a series of simple steps: get all the roads; sort the points into order; make groups of identical points on different roads into intersections (nodes); add arcs in both directions where nodes connect (or in one direction only for a one-way road).
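As a rough sketch of those steps, assuming each road arrives as an already ordered list of (x, y) points plus a one-way flag (the `ROADS` data and the `build_network` name are invented for illustration):

```python
from collections import defaultdict
from math import dist

# Hypothetical input: (name, ordered points, one_way).
ROADS = [
    ("High St", [(0, 0), (1, 0), (2, 0)], False),
    ("Mill Ln", [(1, 1), (1, 0), (1, -1)], True),
]

def build_network(roads):
    # A point used by more than one road is an intersection (node).
    usage = defaultdict(set)
    for name, points, _ in roads:
        for p in points:
            usage[p].add(name)
    arcs = defaultdict(list)  # node -> [(neighbour, length, road name)]
    for name, points, one_way in roads:
        last_node, length = points[0], 0.0
        for prev, cur in zip(points, points[1:]):
            length += dist(prev, cur)
            if cur == points[-1] or len(usage[cur]) > 1:
                arcs[last_node].append((cur, length, name))
                if not one_way:
                    # Two-way road: add the arc in the other direction too.
                    arcs[cur].append((last_node, length, name))
                last_node, length = cur, 0.0
    return dict(arcs)

network = build_network(ROADS)
```

Real data rarely has exactly identical coordinates where roads meet, so in practice the "identical points" test usually needs some snapping tolerance.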
The A* algorithm itself is well documented on Wikipedia. The key place to optimise is the selection of the best node from the open list, for which you need a high-performance priority queue. If you're using C++ you can use the STL priority_queue adapter.
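For readers not using C++, here is a minimal A* sketch in Python with `heapq` playing the role of the priority queue. It assumes nodes are (x, y) tuples and that no arc is shorter than the straight-line distance between its ends, so that the straight-line heuristic never overestimates; `GRID` and `a_star` are illustrative names.

```python
import heapq
from math import dist

def a_star(arcs, start, goal):
    """arcs maps node -> [(neighbour, cost)]; nodes are (x, y) tuples."""
    open_list = [(dist(start, goal), 0.0, start)]  # (estimate, cost so far, node)
    best = {start: 0.0}
    parent = {start: None}
    while open_list:
        _, cost, node = heapq.heappop(open_list)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], cost
        if cost > best[node]:
            continue  # stale queue entry, a cheaper one was processed already
        for nxt, step in arcs.get(node, []):
            new_cost = cost + step
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(open_list, (new_cost + dist(nxt, goal), new_cost, nxt))
    return None, float("inf")

# Tiny made-up network: the arc from (0, 1) to (1, 1) is a slow detour.
GRID = {
    (0, 0): [((1, 0), 1.0), ((0, 1), 1.0)],
    (1, 0): [((1, 1), 1.0)],
    (0, 1): [((1, 1), 3.0)],
    (1, 1): [],
}
path, cost = a_star(GRID, (0, 0), (1, 1))
```

Setting the heuristic to zero turns this into Dijkstra's algorithm, which is a handy way to check the two give the same cost.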
Customising the algorithm to route over different parts of the network (e.g., pedestrian, car, public transport, etc.) or favour speed, distance or other criteria is quite easy. You do that by writing filters to control which route segments are available, when building the network, and which weight is assigned to each one.
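One possible shape for such filters, sketched in Python. The segment fields, road classes and profile names here are assumptions for the example, not taken from any particular dataset:

```python
# Hypothetical segment records: (from, to, length_km, road_class, speed_kmh).
SEGMENTS = [
    ("A", "B", 2.0, "motorway", 100),
    ("A", "B", 1.5, "footpath", 5),
    ("B", "C", 1.0, "residential", 30),
]

PROFILES = {
    # Cars may not use footpaths; weight is travel time in hours.
    "car": {"allowed": {"motorway", "residential"},
            "weight": lambda km, kmh: km / kmh},
    # Pedestrians may not use motorways; weight is plain distance.
    "pedestrian": {"allowed": {"footpath", "residential"},
                   "weight": lambda km, kmh: km},
}

def build_arcs(segments, profile):
    """Keep only the segments the profile allows and weight each one."""
    rules = PROFILES[profile]
    arcs = {}
    for a, b, km, road_class, kmh in segments:
        if road_class in rules["allowed"]:
            arcs.setdefault(a, []).append((b, rules["weight"](km, kmh)))
    return arcs

car_arcs = build_arcs(SEGMENTS, "car")
walk_arcs = build_arcs(SEGMENTS, "pedestrian")
```

The routing algorithm itself stays unchanged; only the network it is given differs per profile.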
Another thought occurs to me regarding the cost of each traversal, although it would increase the time and processing power required to compute a route.
Example: there are 3 routes I can take (where I live) to go from point A to B, according to Google Maps. Garmin units offer each of these 3 routes in the Quickest route calculation. After traversing each of these routes many times and averaging (obviously there will be errors depending on the time of day, amount of caffeine, etc.), I feel the algorithms could take into account the number of bends in the road for a higher level of accuracy, e.g. a straight road of 1 mile will be quicker than a 1 mile road with sharp bends in it.
Not a practical suggestion but certainly one I use to improve the result set of my daily commute.
Related
I am developing an application that can show the shortest route using public transport (currently only buses). It should include sections where one can walk some distance to the next stop rather than taking another bus (if that is shorter).
What should the data structure for the map be? I thought of a graph structure with nodes for bus stops and edges with distance as weight.
Even if I have found the shortest path using an algorithm (Dijkstra), how do I build those walking sections into the logic?
Without a lot of extra information, it's difficult to give you a great answer to this question, but let me hit some basics. This should be enough to get you going, but then you're going to need to do additional work to develop your solution.
In general, your data structure is going to be something like nodes that represent destinations or waypoints (like a bus stop, or an address). Your relationships will be modes of transportation with associated costs. For example, you can get from point/node A to point/node B via walking, or the bus. Those are two different relationships, with different "costs" in terms of time and money.
In general, you'll want to use a "weighted shortest path" algorithm to find the best way from point A to point B. Neo4j gives you a shortest path function, but in your case you'll need to assign weights to your relationships, and then calculate the shortest path not based on the number of "hops" through the graph, but based on some overall cost metric (time, money, whatever).
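To make the idea concrete without Neo4j, here is a plain-Python sketch of the same model: stops are nodes, and walking and bus legs are separate weighted relationships between them. The stop names and minute values are invented, and this uses an ordinary Dijkstra search rather than any Neo4j function.

```python
import heapq

# Hypothetical relationships: (from, to, mode, cost in minutes).
LINKS = [
    ("Home", "Stop1", "walk", 5),
    ("Stop1", "Stop2", "bus", 10),
    ("Stop2", "Stop3", "bus", 12),
    ("Stop2", "Office", "walk", 6),
    ("Stop3", "Office", "walk", 2),
]

def shortest(links, start, goal):
    """Dijkstra over total cost, not hop count; returns (cost, legs)."""
    graph = {}
    for a, b, mode, minutes in links:
        graph.setdefault(a, []).append((b, mode, minutes))
    queue = [(0, start, [])]
    seen = set()
    while queue:
        cost, node, legs = heapq.heappop(queue)
        if node == goal:
            return cost, legs
        if node in seen:
            continue
        seen.add(node)
        for nxt, mode, minutes in graph.get(node, []):
            heapq.heappush(queue, (cost + minutes, nxt, legs + [(mode, nxt)]))
    return None

result = shortest(LINKS, "Home", "Office")
```

In this toy data, getting off at Stop2 and walking beats staying on the bus to Stop3, which is exactly the "walk instead of another bus" case: the walking legs are just more edges, so no special-case logic is needed.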
Ian Robinson wrote a great post on how to do weighted shortest paths in neo4j. So you should follow a template like that as a starting point.
You have a bunch of design questions to answer though. Do you want the shortest path in terms of time, money, effort, or some combination? The answer to that will affect your graph design, and your query strategy.
I'm working on a transportation model, and am about to do a travel time matrix between 5,000 points. Is there a free, semi-reliable way to calculate the travel times between all my nodes?
I think google maps has a limit on the number of queries / hits I can achieve.
EDIT
I'd like to use an api such as google maps or similar ones as they include data such as road directions, number of lanes, posted speed, type of road, etc ...
EDIT 2
Please be advised that openstreet map data is incomplete and not available for all jurisdictions outside the US
Google Directions API restricts you to 2500 calls per day. Additionally, terms of service stipulate that you must only use the service "in conjunction with displaying the results on a Google map".
You may be interested in OpenTripPlanner, an in-development project which can do multi-modal routing, and Graphserver on which OpenTripPlanner is built.
One approach would be to use OpenStreetMap data with Graphserver to generate Shortest Path Trees from each node.
As that's 12,497,500 pairwise connections (5,000 * 4,999 / 2), I'm pretty sure you'll hit some sort of limit if you attempt to use Google Maps for all of them. How accurate do the results need to be, and how far are you travelling?
I might try to generate a crude map with travel speeds on it (e.g. mark off interstates as fast, yadda yadda) then use some software to calculate how long it would take from point to point. One could visualize it as an electromagnetic fields problem, where you're trying to calculate the resistance from point to point over a plane with varying resistance (interstates are wires, lakes are open circuits...).
If you really need all these routes accurately calculated and stored in your database, it sounds like (and I would believe) you are going to have to spend the money to obtain this. As you can imagine, this is expensive to develop and there should be remuneration.
I would, however, probe a bit about your problem:
Do you really need all of the roughly 12.5 million pairwise distances in a database? What if you asked Google for them as you needed them, and then cached them (if allowed)? I've had web applications like this where, because of the slow traffic ramp-up pattern, I was able to leverage free services early on to vet the idea.
Do you really need all 5000 points? Or could you pick the top 100 and have a more tractable problem?
Perhaps there is some hybrid where you store distances between big cities and do more estimates for shorter distances.
Again, I really don't know what your problem is, but maybe thinking a bit outside the box will help you find an easier solution.
You might have to go for some heuristics here. Maybe you can estimate travel time based on a few factors like geometric distance and some features about the start and end points (urban vs rural areas, country, ...). You could get a few distances, try to fit your parameters on a subset of them and see how well you're able to predict the other ones. My prediction would be, for example, that travel times approach linear dependence from distance as distance grows larger, in many cases.
I know it's messy, but hey you're trying to estimate 12.5mio datapoints (or whatever the amount :)
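As a very small illustration of the fitting idea, here is an ordinary least-squares line of travel time against distance in plain Python. The sample numbers are invented purely to show the mechanics; real samples would come from a handful of actual queries, and a real model would probably add features such as urban vs rural.

```python
def fit_line(samples):
    """Least-squares fit of minutes = a * km + b from (km, minutes) pairs."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    a = sxy / sxx
    return a, mean_y - a * mean_x

# Made-up training pairs (km, minutes); not real measurements.
KNOWN = [(10, 17), (20, 27), (40, 47), (80, 87)]
a, b = fit_line(KNOWN)

def predict(km):
    return a * km + b
```

Fitting on one subset and comparing `predict` against the held-back pairs gives a quick feel for whether the estimate is good enough to avoid querying every pair.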
You might also be able to incrementally add knowledge from already-retrieved "real" travel times by finding close points to the ones you're looking for:
get closest points StartApprox, EndApprox to starting and end position such that you have a travel time between StartApprox and EndApprox
compute distances StartError, EndError between start and StartApprox, end and EndApprox
if StartError+EndError>Distance(StartApprox, EndApprox) * 0.10 (or whatever your threshold) -> compute distance via API (and store it), else use known travel time plus overhead time based on StartError+EndError
(if you have 100 addresses in NY and 100 in SF, all the values are going to be more or less the same (ie the difference between them is probably lower than the uncertainty involved in these predictions) and such an approach would keep you from issuing 10000 queries where 1 would do)
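The three steps above might look like this in Python. It is a sketch under several assumptions: points are planar (x, y) tuples, `query_api` is a placeholder for whatever routing service is used, and `minutes_per_unit` is a made-up overhead rate for the leftover distance.

```python
from math import dist

def travel_time(known, start, end, query_api, threshold=0.10, minutes_per_unit=1.0):
    """known maps (start_point, end_point) -> minutes already retrieved."""
    best = None
    for (s, e), minutes in known.items():
        # StartError + EndError for this stored pair.
        error = dist(start, s) + dist(end, e)
        if best is None or error < best[0]:
            best = (error, dist(s, e), minutes)
    if best is not None and best[0] <= best[1] * threshold:
        # Close enough: reuse the known time plus an overhead for the error.
        return best[2] + best[0] * minutes_per_unit
    minutes = query_api(start, end)  # too far off: ask the real service
    known[(start, end)] = minutes
    return minutes
```

The linear scan over `known` is fine for a sketch; with millions of stored pairs a spatial index would be needed to find the closest pair quickly.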
Many GIS software packages have routing algorithms, if you have the data... Transportation data can be fairly spendy.
There are some other choices of sources for planning routes. Is this something to be done repeatedly, or a one-time process? Can this be broken up into smaller sub-sets of points? Perhaps you can use multiple routing sources and break up the data points into segments small enough for each routing engine.
Here are some other choices from quick Google search:
Wikipedia
Route66
Truck Miles
Let me clarify first: I have no intention of peeping into TfL's database and related information, nor any other evil intention.
But of course millions of users benefit greatly from the way it serves information.
http://journeyplanner.tfl.gov.uk/
So, if we want to create a site like the TfL Journey Planner, what are the basic things we need to keep in mind?
Which Architecture We should use?
Can We create this website using ASP.NET(Should be able to)?
Is TFL integrating its website with Google Maps or any other GPS?
Edit:
When you enter the ZIP/PIN code or station name, it automatically creates a map from source to destination and also calculates the distance.
My question here is: how do they calculate the distance? Do they rely on maps or GPS, or have they created their own web service?
To answer the points, in order:
Which Architecture We should use?
One you know and understand. There is more than one approach that could be used to do a similar thing.
Can We create this website using ASP.NET(Should be able to)?
You could. Similarly, you could do it as a Java Servlet or PHP application. If you were feeling particularly warped, you could probably make something work in pure Javascript (but your clients might hate you)
Is TFL integrating its website with Google Maps or any other GPS?
They're more likely using Ordnance Survey data, that they've rendered their own maps from (certainly if you pan right out, coverage runs out quite quickly).
From a routing perspective, they're probably using something like Dijkstra's Algorithm, although it's probably very optimised to cope with timetabling.
There are numerous algorithms for routing, which boil down to "relative cost" (where that cost may be distance, time, financial, or a combination). Not taking into consideration timetables, you can precalculate the costs between connected nodes (e.g. Liverpool St -> Bank via Central Line is ~5 mins). This would give a baseline for something like Dijkstra, although you'd still need to factor in the cost of interchanging between modes of transport, waiting for connections to arrive, etc.
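A rough sketch of that baseline, with a flat interchange penalty added whenever the line changes. Only the Liverpool St to Bank figure comes from the text above; the other ride times and the 6-minute penalty are placeholders, and nothing here reflects how TfL actually does it.

```python
import heapq

# (from, to, line, minutes). Only the first entry is from the text above.
RIDES = [
    ("Liverpool St", "Bank", "Central", 5),
    ("Bank", "Waterloo", "Waterloo & City", 4),   # assumed value
    ("Bank", "St Paul's", "Central", 2),          # assumed value
]
INTERCHANGE_MINUTES = 6  # assumed flat cost of changing line, waiting included

def journey_minutes(rides, start, goal):
    graph = {}
    for a, b, line, minutes in rides:
        graph.setdefault(a, []).append((b, line, minutes))
        graph.setdefault(b, []).append((a, line, minutes))
    queue = [(0, start, "")]  # "" means not yet travelling on any line
    done = set()
    while queue:
        cost, station, line = heapq.heappop(queue)
        if station == goal:
            return cost
        if (station, line) in done:
            continue
        done.add((station, line))
        for nxt, nxt_line, minutes in graph.get(station, []):
            # Pay the penalty only when switching from one line to another.
            change = INTERCHANGE_MINUTES if line and line != nxt_line else 0
            heapq.heappush(queue, (cost + minutes + change, nxt, nxt_line))
    return None
```

Tracking the state as (station, line) rather than just the station is what lets the search charge for interchanges; timetables would extend the state further with the time of arrival.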
You might want to look into routing algorithms in general (there's even info over on OpenStreetMap's wiki) before looking into the complexities introduced with timetabled services.
I've got a list of points, and a route that an external provider has generated through those points.
I would like to generate a route using those same point with my own road network.
Then I want to be able to detect if there is any significant difference between the two routes.
One suggestion is that for the 2 routes, we find out what road segments they travel across, and compare the list of road segments?
Is this a valid approach?
How do we go about getting the list of road segments given a route?
I am using ArcGis server 9.3 with Java 5 and Oracle 10g. I am using the ST functions and NetworkAnalyst via the java api.
Thanks.
Calculate the route using your points and road network. Then buffer the resulting route into a polygon (the buffer radius should be your "tolerance"). Then clip the external route using your polygon. If the resulting polyline is non-empty, then there is a deviation outside of your tolerance.
This method does not account for any "significant" deviations that stay inside the buffer, such as backtracking, U-turns, or taking a nearby parallel road.
Alternatively, you can compare the resulting "directions" and check for deviations there, particularly using street names. This saves you from checking every road segment. If you have any deviations in road names, then check the individual road segments of each section.
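Outside ArcGIS, the buffer-and-clip test can be approximated in plain Python by checking how far each vertex of the external route lies from your own route. This is a simplification, not the ArcGIS API: it assumes planar coordinates and only tests vertices, so a long external segment may need densifying first.

```python
from math import hypot

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its ends.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))

def deviations(my_route, external_route, tolerance):
    """External-route vertices that fall outside the tolerance corridor."""
    return [p for p in external_route
            if min(point_segment_distance(p, a, b)
                   for a, b in zip(my_route, my_route[1:])) > tolerance]
```

An empty result corresponds to the clipped polyline being empty in the buffer approach.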
I've just implemented something similar in my application. I have a list of lat/long coordinates from a GPS device and needed to create a route based on this data.
I started by matching each GPS position with a node in my street network. I then removed 'consecutively duplicate' nodes to filter out those consecutive positions that are at the same node. Then, I started 'walking' through my street network, starting at the first node. I checked the first node and second node and checked for a common street segment. If I found one, great. If not, I create a shortest path between the 2 nodes and use those roads instead. I continue doing this until I've examined all the nodes. At the end of this process, I have a list of road segments that the vehicle traveled and the order in which they were traveled, too.
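Since the original code can't be shared, here is a toy Python version of the same process. The four-node network is made up, nodes are matched by straight-line distance, and the network is assumed to be connected.

```python
import heapq
from math import dist

NODES = {"A": (0, 0), "B": (1, 0), "C": (2, 0), "D": (2, 1)}  # hypothetical
ADJ = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

def shortest_path(start, goal):
    """Dijkstra using straight-line segment lengths as weights."""
    queue, seen = [(0.0, [start])], set()
    while queue:
        cost, path = heapq.heappop(queue)
        node = path[-1]
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in ADJ[node]:
            heapq.heappush(queue, (cost + dist(NODES[node], NODES[nxt]), path + [nxt]))
    return None

def match_route(fixes):
    # 1. Match each GPS fix to its nearest node.
    matched = [min(NODES, key=lambda n: dist(NODES[n], fix)) for fix in fixes]
    # 2. Drop consecutive duplicates.
    nodes = [n for i, n in enumerate(matched) if i == 0 or n != matched[i - 1]]
    # 3. Walk the pairs: use the shared segment, else fill in a shortest path.
    segments = []
    for a, b in zip(nodes, nodes[1:]):
        hop = [a, b] if b in ADJ[a] else shortest_path(a, b)
        segments.extend(zip(hop, hop[1:]))
    return segments

travelled = match_route([(0.1, 0.1), (0.05, -0.1), (0.9, 0.1), (2.1, 0.9)])
```

The resulting ordered list of segments is what could then be compared against the segments of the other route.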
Unfortunately, I'm using a different map, different programming language, and different database. As such, sharing the code won't be helpful to you at all. Hopefully the process I described above will be enough help for you to accomplish your task.
Is there a way using the Google Maps API to get back an "optimized" route given a set of waypoints (in other words, a "good-enough" solution to the traveling salesman problem), or does it always return the route with the points in the specified order?
There is an option in Google Maps API DirectionsRequest called optimizeWaypoints, which should do what you want. This can only handle up to 8 waypoints, though.
Alternatively, there is an open source (MIT license) library that you can use with the Google Maps API to get an optimal (up to 15 locations) or pretty close to optimal (up to 100 locations) route.
See http://code.google.com/p/google-maps-tsp-solver/
You can see the library in action at www.optimap.net
It always gives them in order.
So I think you'd have to find the distance (or time) between each pair of points, one at a time, then solve the traveling salesman problem yourself. Maybe you could convince Google Maps to add that feature though. I guess what constitutes a "good enough" solution depends on what you're doing and how fast it needs to be.
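For the "solve it yourself" step, a common good-enough approach is a nearest-neighbour tour improved with 2-opt. The sketch below assumes `d` is a symmetric matrix of pairwise distances or times that you have already collected; the function names are illustrative.

```python
def tour_length(order, d):
    """Length of the closed tour visiting the points in the given order."""
    return sum(d[a][b] for a, b in zip(order, order[1:] + order[:1]))

def nearest_neighbour(d, start=0):
    """Greedy tour: always go to the closest unvisited point."""
    order, left = [start], set(range(len(d))) - {start}
    while left:
        nxt = min(left, key=lambda j: d[order[-1]][j])
        order.append(nxt)
        left.remove(nxt)
    return order

def two_opt(order, d):
    """Keep reversing sub-sections of the tour while that shortens it."""
    improved = True
    while improved:
        improved = False
        for i in range(1, len(order) - 1):
            for j in range(i + 1, len(order)):
                candidate = order[:i] + order[i:j + 1][::-1] + order[j + 1:]
                if tour_length(candidate, d) < tour_length(order, d) - 1e-9:
                    order, improved = candidate, True
    return order
```

Typical use is `two_opt(nearest_neighbour(d), d)`. It gives no optimality guarantee, and recomputing the full tour length for each candidate is wasteful, but it is simple and usually adequate for a few dozen stops.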
Google has a ready solution for the Traveling Salesman Problem. It is OR-Tools (Google's Operations Research tools), which you can find here: https://developers.google.com/optimization/routing/tsp
What you need to do basically is 2 things:
Get the distances between each two points using Google Maps API: https://developers.google.com/maps/documentation/distance-matrix/start
Then you will feed the distances in an array to the OR-Tools and it will find a very-good solution for you (For certain instances with millions of nodes, solutions have been found guaranteed to be within 1% of an optimal tour).
You can also note that:
In addition to finding solutions to the classical Traveling Salesman
Problem, OR-Tools also provides methods for more general types of
TSPs, including the following:
Asymmetric cost problems — The traditional TSP is symmetric: the distance from point A to point B equals the distance from point B to
point A. However, the cost of shipping items from point A to point B
might not equal the cost of shipping them from point B to point A.
OR-Tools can also handle problems that have asymmetric costs.
Prize-collecting TSPs, where benefits accrue from visiting nodes
TSP with time windows
Additional links:
OR-tools at Github: https://github.com/google/or-tools
Get Started: https://developers.google.com/optimization/introduction/get_started
In a typical TSP problem, the assumption is one can travel directly between any two points. For surface roads, this is never the case. When Google calculates a route between two points, it does a heuristic spanning tree optimization, and usually comes up with a fairly close to optimal path.
To calculate a TSP route, one would first have to ask Google to calculate the pair-wise distance between every node in the graph. I think this requires n*(n-1) / 2 calcs. One could then take those distances and perform a TSP optimization on them.
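A small Python sketch of those two stages: one lookup per unordered pair, then an optimization over the resulting matrix. `lookup` stands in for whatever distance service is used (a hypothetical callable, not a Google API), distances are assumed to be the same in both directions, and the exhaustive search is only practical for a handful of points.

```python
from itertools import combinations, permutations

def distance_matrix(points, lookup):
    """Call lookup once per unordered pair: n * (n - 1) / 2 calls in total."""
    n = len(points)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = lookup(points[i], points[j])
    return d

def closed_length(tour, d):
    return sum(d[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))

def best_tour(d):
    """Try every ordering that starts at point 0 and keep the shortest."""
    rest = range(1, len(d))
    return min(([0, *p] for p in permutations(rest)),
               key=lambda tour: closed_length(tour, d))
```

With one-way streets the distance from A to B can differ from B to A, in which case both directions need looking up, n * (n - 1) calls rather than half that.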
OpenStreetMaps.org has a Java WebStart application which may do what you want. Of course the calculations are being run client side. The project is open source, and may be worth a look.
Are you trying to find an optimal straight line path between locations, or the optimal driving route? If you just want to order the points, if you can get the GPS coordinates, it becomes a very easy problem.
Just found http://gebweb.net/optimap/ It looks nice and easy. Online version using google maps.