OpenStreetMap: building a weighted graph - gis

I would like to create a fairly big (~1 billion nodes) weighted graph, with nodes being locations and edges being the roads present in OpenStreetMap data. Let us say we focus on a single country to keep the size within that limit. The weights of the edges can be the actual lengths of the roads they represent. What would you do? Should I write my own parser for the XML data and construct the graph in a straightforward way?

You may find gis.stackexchange.com useful. The keywords are PostGIS and pgRouting. See e.g. https://gis.stackexchange.com/questions/21680/collecting-street-data-to-populate-a-graph-stucture-for-routing/21682#21682 and similar questions.
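If you do end up writing your own parser, a minimal sketch might look like the following. It assumes a standard OSM XML export in which all `node` elements come before the `way` elements, treats every way carrying a `highway` tag as a two-way road (one-way restrictions are ignored), and weights each segment by its haversine length on a spherical Earth. `build_graph` and `haversine_m` are illustrative names, not part of any library. Plain Python dicts will not hold a country-sized graph comfortably, so this only shows the shape of the parsing step on a small extract.

```python
import math
import xml.etree.ElementTree as ET
from collections import defaultdict

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; a spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def build_graph(osm_source):
    """Return {node_id: {neighbour_id: length_in_metres}} from OSM XML.

    osm_source is a file name or a binary file object.
    Assumption: nodes appear before the ways that reference them.
    """
    coords = {}                # node id -> (lat, lon)
    graph = defaultdict(dict)  # adjacency map with edge weights
    for _, elem in ET.iterparse(osm_source, events=("end",)):
        if elem.tag == "node":
            coords[elem.get("id")] = (float(elem.get("lat")), float(elem.get("lon")))
            elem.clear()  # release child tags we no longer need
        elif elem.tag == "way":
            is_road = any(t.get("k") == "highway" for t in elem.findall("tag"))
            if is_road:
                refs = [nd.get("ref") for nd in elem.findall("nd")]
                # Each consecutive pair of node refs is one road segment.
                for a, b in zip(refs, refs[1:]):
                    if a in coords and b in coords:
                        w = haversine_m(*coords[a], *coords[b])
                        graph[a][b] = w
                        graph[b][a] = w  # assumption: every road is two-way
            elem.clear()
    return dict(graph)
```

Only nodes that lie on at least one road end up in the graph. A real router would also want to collapse chains of intermediate nodes between junctions, which is the kind of work pgRouting's tooling already handles.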

Related

Given a user's lat lng, how to find the nearest lat lng from a database of thousands of lat lng?

I have data of locations of thousands of sensors in MySQL. I want to identify the sensor closest to the user's location and show that specific sensor's data. All the location data is available as lat lng.
I understand that one approach is to compute the distance between the origin and every sensor using the haversine formula and select the sensor with the shortest distance. The problem here is that there are tens of thousands of sensors.
Any suggestions/leads?
A spatial index allows efficient queries for points within a given distance. The problem, of course, is that you might not know the search radius needed in a specific case: a large radius makes the query inefficient, and a small radius might return no match at all.
A possible solution is to search with an increasing radius until the search returns some results, and then find the closest result among those.
This article describes the solution for BigQuery; it would require some adaptation for MySQL's scripting dialect:
https://mentin.medium.com/nearest-neighbor-using-bq-scripting-373241f5b2f5
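The control flow of the increasing-radius search can be sketched in Python. This is not the article's code: `points_within` is a hypothetical stand-in for the indexed radius query the database would run (here it is just a linear scan), and the starting radius and growth factor are arbitrary assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def points_within(points, lat, lon, radius_km):
    """Stand-in for an indexed radius query; returns [(distance_km, point), ...].

    Each point is a tuple (sensor_id, lat, lon).
    """
    hits = []
    for p in points:
        d = haversine_km(lat, lon, p[1], p[2])
        if d <= radius_km:
            hits.append((d, p))
    return hits


def nearest_expanding(points, lat, lon, start_km=1.0, factor=2.0, max_km=20100.0):
    """Grow the search radius until something is found, then pick the closest hit.

    max_km exceeds half the Earth's circumference, so the loop always terminates.
    Returns (distance_km, point), or None if there are no points at all.
    """
    radius = start_km
    while True:
        hits = points_within(points, lat, lon, radius)
        if hits:
            # Every point closer than the best hit is also inside the radius,
            # so the minimum over the hits is the true nearest neighbour.
            return min(hits, key=lambda h: h[0])
        if radius >= max_km:
            return None
        radius = min(radius * factor, max_km)
```

The filter must use the true distance rather than a bounding box alone. A box can contain a corner point that is farther away than a closer point lying just outside the box.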
Not the MySQL answer you are looking for, but PostgreSQL's popular PostGIS extension has a built-in K Nearest Neighbor operator class. Also, see its documentation. It works great!
Also, I am aware of this Go library that allows you to do KNN in memory after building a Quadtree with your sensor locations.
For only thousands, a simple bounding box with two 2-column indexes may be fast enough.
For better speed, see SPATIAL indexing.
For details on those two solutions, plus two faster ones, see Find Nearest

Tableau geographical heatmap and overlaying data from two different datasets

I'm trying to overlay latitude and longitude points from two different datasets on a geographical map, with a different marker for each dataset. There is no relationship between the two datasets. Tableau doesn't seem to be able to accomplish this directly. I don't want to group the data at all, just plot the lat and lon points. Any suggestions?
I would also like one of the datasets above to be a heatmap, i.e. each plotted data point has its intensity correlated to a dimension. Besides the overlaying problem above, the geographical heatmap on its own is not working for me: plotted by latitude and longitude, it does not convey the information I want. The lighter marks are drawn on top of the darker marks, but I want the darker marks to be in front. How do I achieve this?
Would Google Maps or Fusion Tables be a better option for me?
I solved this problem by combining the two datasets into one using UNION ALL, making sure to add an extra column with a descriptor designating the originating source (i.e., which of the two datasets). I then set up a calculated field to determine what data to visualize and how to visualize it.
As for the heatmap, the solution was to bin the data and then sort it when plotting. Without binning, sorting does not appear to be possible in Tableau.
I took a look at Google Maps and Fusion Tables, but the functionality and visual aesthetics are currently not at the Tableau level.

Find the nearest geo positions

I am looking for a way to get the geo positions near a given geo position. I can calculate the distance between two positions, but I need to find all geo positions within a radius of 10-20 miles of a point. I found something similar on Flickr:
http://m.flickr.com/#/nearby/
Does anybody have an idea how it works? They must convert a latitude and longitude to a unique value and then find all entries near that position, or something like that.
Thanks for help!
You might use Voronoi diagrams, but pre-sorting your data by each coordinate (separately) and then intersecting the sets of points that lie nearby in each coordinate would probably solve your problem more easily.
A point location data structure can be built on top of the Voronoi diagram in order to answer nearest neighbor queries, where one wants to find the object that is closest to a given query point. Nearest neighbor queries have numerous applications.
You can use a kd-Tree. Some time ago I tried this one and it worked quite well:
https://github.com/jmhodges/kdtree2
Use a (point) quadtree or a k-d tree, or, if the number of points is not high, even a brute-force search.
Do not use Voronoi diagrams. They are among the most complex algorithms to implement.
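A minimal sketch of the pre-sorting idea suggested above, reduced to a single coordinate: points are sorted by latitude once, a binary search picks out the latitude band that can contain matches, and only that band is checked with the haversine formula. `LatSortedIndex` is a hypothetical name, the Earth radius in miles is a spherical approximation, and a quadtree or k-d tree would prune far more when the data is dense.

```python
import bisect
import math

EARTH_RADIUS_MI = 3958.8  # mean Earth radius in miles (approximation)


def haversine_mi(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MI * math.asin(math.sqrt(a))


class LatSortedIndex:
    """Points pre-sorted by latitude; each point is (lat, lon, payload)."""

    def __init__(self, points):
        self._points = sorted(points, key=lambda p: p[0])
        self._lats = [p[0] for p in self._points]

    def within(self, lat, lon, radius_mi):
        """Return [(distance_mi, payload), ...] sorted by distance."""
        # A point within radius_mi cannot differ in latitude by more than this.
        dlat = math.degrees(radius_mi / EARTH_RADIUS_MI)
        lo = bisect.bisect_left(self._lats, lat - dlat)
        hi = bisect.bisect_right(self._lats, lat + dlat)
        hits = []
        for plat, plon, payload in self._points[lo:hi]:
            d = haversine_mi(lat, lon, plat, plon)
            if d <= radius_mi:
                hits.append((d, payload))
        hits.sort(key=lambda h: h[0])
        return hits
```

For example, `LatSortedIndex(points).within(40.7128, -74.0060, 15)` would return everything within 15 miles of that position, nearest first.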

How to very efficiently assign lat/long to city boundary described by shape?

I have a huge shapefile of 36,000 non-overlapping polygons (city boundaries). I want to determine easily which polygon a given lat/long falls into. What would be the best way, given that it must be extremely computationally efficient?
I was thinking of creating a lookup table (tilex,tiley,polygone_id) where tilex and tiley are tile identifiers at zoom level 21 or 22. Yes, the loss of precision from using tile numbers and a planar projection is acceptable in my application.
I would rather not use Postgres's GIS extension, and I am fine with a program that runs for 2 days to generate all the INSERT statements.
Insert statements into what? Are you using a different spatial database or some other database? If you are willing to use Python, C, or Java, you could use shapely, GEOS, or JTS to write some custom code that does what you want rather simply.
In Python, use this library to open the shapefile:
http://indiemaps.com/blog/2008/03/easy-shapefile-loading-in-python/
then use shapely
http://gispython.org/shapely/docs/1.0/manual.html#contains
to test containment.
For Java, use GeoTools, which also includes JTS.
Sounds like you want a BSP tree. Basically, you divide the area into smaller and smaller polygons in a tree-like fashion.
The advantage is that you don't need to compare coordinates with every polygon later on. That makes it a very fast way to find the correct polygon.
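For illustration, a containment lookup can also be sketched without external libraries. The code below is neither a BSP tree nor shapely: it is a simpler stand-in that rejects most polygons with a bounding-box comparison and runs a ray-casting test on the rest. It assumes each polygon is a single outer ring of (lon, lat) vertices with no holes and treats coordinates as planar, which the question says is acceptable. `PolygonLocator` and `point_in_ring` are hypothetical names.

```python
def point_in_ring(x, y, ring):
    """Ray-casting test: is (x, y) inside the polygon given by ring?

    ring is a list of (x, y) vertices; the closing edge is implied.
    Points exactly on an edge may be classified either way.
    """
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Does the edge (j -> i) straddle the horizontal line through y?
        if (yi > y) != (yj > y):
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


class PolygonLocator:
    """Maps a point to the id of the polygon containing it."""

    def __init__(self, polygons):
        # polygons: dict of id -> list of (lon, lat) vertices (outer ring only)
        self._items = []
        for pid, ring in polygons.items():
            xs = [p[0] for p in ring]
            ys = [p[1] for p in ring]
            self._items.append((min(xs), min(ys), max(xs), max(ys), pid, ring))

    def locate(self, lon, lat):
        """Return the id of the first polygon containing the point, or None."""
        for minx, miny, maxx, maxy, pid, ring in self._items:
            # Cheap bounding-box rejection before the exact test.
            if minx <= lon <= maxx and miny <= lat <= maxy:
                if point_in_ring(lon, lat, ring):
                    return pid
        return None
```

This still compares against every bounding box, so for 36,000 polygons the boxes would themselves go into a grid or tree. That is where the tile lookup table or the BSP idea comes in.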

Calculate longitude/latitude

Given the following input:
known longitudes/latitudes of 1..n locations
known distance between locations 1..n and another location "m"
How can I calculate the longitude/latitude of the location "m"?
This sounds like a basic latitude-longitude triangulation question. The common approaches are outlined in a Yahoo! Answers topic here. There are likely libraries to do this in many languages. A Google search for "latitude longitude triangulation" plus your language of choice will likely turn up some existing code to use. "Geocoding" is another common task rolled into similar libraries, so that may be another useful keyword.
Edit: As others have mentioned, "trilateration" seems to be the best term. However, depending on your data and requirements, there are simpler approximation solutions that may satisfy your requirements.
The Yahoo! Answers post is quoted below for convenience:
"For larger distances, spherical
geometry. For relatively small ones, treat the earth as flat, and the coordinates as xy coordinates. For the distances to work with the degrees of the coordinates, you will have to use the cosine function to convert from one to the other. (While degrees of latitude are about 69 miles all over the earth, degrees of longitude vary from the same at the equator to 0 at the poles.)
You have the center points of three circles and the radius of those circles. They are supposed to intersect at one point, so you can treat them in pairs to find the intersection points of each and throw out the ones that don't match http://mathworld.wolfram.com/Circle-CircleIntersection.html."
(mike1942f)
Trilateration is what you want. It requires only 3 of your reference points; the rest can be used to increase accuracy if you want to get really clever.
The trickiest part is working with long/lat as opposed to Cartesian coordinates, especially as the earth is not a perfect sphere.
This is a trilateration problem. In your case, you have multiple points of reference, so you can choose the position of m that minimizes the sum of squared errors between the given distances and the distances from m to each reference point.
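A rough sketch of the least-squares idea, under simplifying assumptions: the Earth is treated as locally flat (with the cosine correction for longitude described in the quoted answer), and the circle equations are linearized by subtracting the first one from the others, so the code solves a linear least-squares problem rather than minimizing the squared distance errors directly. `trilaterate` is a hypothetical name, distances are in kilometres, and the approach is only reasonable when all points are fairly close together.

```python
import math

EARTH_RADIUS_KM = 6371.0  # spherical approximation


def trilaterate(refs):
    """Estimate (lat, lon) of m from refs = [(lat, lon, distance_km), ...].

    Needs at least 3 reference points that are not all on one line.
    """
    if len(refs) < 3:
        raise ValueError("need at least 3 reference points")
    # Local flat projection around the centroid of the reference points.
    lat0 = sum(r[0] for r in refs) / len(refs)
    lon0 = sum(r[1] for r in refs) / len(refs)
    k = math.radians(1.0) * EARTH_RADIUS_KM  # km per degree of latitude
    cos0 = math.cos(math.radians(lat0))      # shrink factor for longitude
    pts = [((lon - lon0) * k * cos0, (lat - lat0) * k, d) for lat, lon, d in refs]

    # Subtracting circle 1 from circle i gives a linear equation a*x + b*y = c.
    x1, y1, d1 = pts[0]
    saa = sab = sbb = sac = sbc = 0.0
    for xi, yi, di in pts[1:]:
        a = 2.0 * (xi - x1)
        b = 2.0 * (yi - y1)
        c = d1 ** 2 - di ** 2 + xi ** 2 + yi ** 2 - x1 ** 2 - y1 ** 2
        saa += a * a
        sab += a * b
        sbb += b * b
        sac += a * c
        sbc += b * c

    # Solve the 2x2 normal equations for the least-squares (x, y).
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("reference points are collinear or coincide")
    x = (sac * sbb - sab * sbc) / det
    y = (saa * sbc - sab * sac) / det
    return lat0 + y / k, lon0 + x / (k * cos0)
```

With exactly three references this reduces to plain trilateration. Extra references simply add rows to the system, which is how the additional points help when the distances are noisy.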