How to get raw traffic flow data - google-maps

I'm creating an application that uses car count data from traffic. How do I access the live Google Maps, Waze, or Bing Maps data?

The Azure Maps platform has raw traffic data you can access for analysis purposes (I have been working with several others who are doing this).
There are two approaches. The first is to use the Traffic Flow Segment API: https://learn.microsoft.com/en-us/rest/api/maps/traffic/gettrafficflowsegment
The second is to use the traffic flow tiles: https://learn.microsoft.com/en-us/rest/api/maps/traffic/gettrafficflowtile If you need traffic data over a large area, this is the lower-cost approach, although more complex. Azure Maps provides the traffic flow tiles both as images (raster), like most platforms, and in vector tile format (which aligns with the open vector tile standard created by Mapbox). You can download all the tiles over your area of interest and extract the data for analysis. If you request traffic relative to the speed limit, each line segment in the tile will have a value between 0 and 1 which indicates the speed relative to the speed limit. For example, 0.2 would mean traffic is flowing at 20% of the speed limit.
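For the first approach, a minimal sketch of calling the Traffic Flow Segment API with Python's requests library is shown below. The query parameters follow the linked REST reference; the response field names are what the API has returned in my experience, but double-check them against the current docs.

    # Sketch: fetch flow data for the road segment nearest a point.
    # The subscription key and coordinates are placeholders.
    import requests

    AZURE_MAPS_KEY = "<your-subscription-key>"

    resp = requests.get(
        "https://atlas.microsoft.com/traffic/flow/segment/json",
        params={
            "api-version": "1.0",
            "style": "relative",           # speed relative to free-flow conditions
            "zoom": 10,
            "query": "47.6062,-122.3321",  # lat,lon of the point of interest
            "subscription-key": AZURE_MAPS_KEY,
        },
        timeout=30,
    )
    resp.raise_for_status()

    segment = resp.json()["flowSegmentData"]
    print("current speed:", segment["currentSpeed"])
    print("free-flow speed:", segment["freeFlowSpeed"])
    print("confidence:", segment["confidence"])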


Is there any limit on the size of the CAD file, say RVT file that Forge Viewer can display?

We have large Revit files ranging from 10 MB to 200 MB. The Forge Viewer seems to take time but loads them. The problem arises with our own algorithms, which get properties inside the viewable once it is loaded in the Forge Viewer. I was wondering if the Forge Viewer itself has any limit on the size of the metadata or CAD files it can display. Also, does the getProperties call on the viewer object have any limitations with bigger metadata?
Neither the Viewer nor our Derivative Service (model translation) has any hard limits on the physical size or complexity of the models being processed.
The real boundaries lie with the browser: different browsers on different devices have various memory limits for each tab, and they will crash if the model gets too large. The processing efficiency of the Derivative Service may also vary depending on the platform workloads, causing translation jobs to time out if the model is too large and complex (an ongoing issue that our Engineering team is looking to mitigate).

Google Compute Engine auto scaling based on queue length

We host our infrastructure on Google Compute Engine and are looking into Autoscaling for groups of instances. We do a lot of batch processing of binary data from a queue. In our case, this means:
When a worker is processing data, the CPU is always at 100%
When the queue is empty, we want to terminate all workers
Depending on the length of the queue, we want a certain number of workers
However, I'm finding it hard to figure out a way to autoscale this on Google Compute Engine, because it appears to scale only on per-instance metrics such as CPU. From the documentation:
Not all custom metrics can be used by the autoscaler. To choose a valid custom metric, the metric must have all of the following properties:
The metric must be a per-instance metric.
The metric must be a valid utilization metric, which means that data from the metric can be used to proportionally scale up or down the number of virtual machines.
If I'm reading the documentation correctly, this makes it hard to use autoscaling based on a global queue length?
Backup solutions
Write a simple auto-scale handler using the Google Cloud API to create or destroy new workers using the Instances API
Write a simple auto-scale handler using instance groups and then manually insert/remove instances using InstanceGroups: insert
Write a simple auto-scale handler using InstanceGroupManagers: resize (see the sketch after this list)
Create a custom per-instance metric which measures len(queue)/len(workers) on all workers
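For the InstanceGroupManagers: resize option, a rough sketch with the google-api-python-client might look like this (project, zone, group name, and the queue-length lookup are placeholders you would wire to your own setup):

    # Hand-rolled "autoscaler": poll the queue length and resize a managed
    # instance group accordingly. Assumes Application Default Credentials
    # and the google-api-python-client package.
    import time

    from googleapiclient import discovery

    PROJECT = "my-project"
    ZONE = "us-central1-b"
    GROUP = "batch-workers"
    ITEMS_PER_WORKER = 100   # how much backlog one worker should handle
    MAX_WORKERS = 50

    def queue_length() -> int:
        """Stub: return the current backlog of your queue."""
        raise NotImplementedError

    compute = discovery.build("compute", "v1")

    while True:
        # Ceiling division: zero workers when the queue is empty.
        target = min(MAX_WORKERS, -(-queue_length() // ITEMS_PER_WORKER))
        compute.instanceGroupManagers().resize(
            project=PROJECT,
            zone=ZONE,
            instanceGroupManager=GROUP,
            size=target,
        ).execute()
        time.sleep(60)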
As of February 2018 (Beta), this is possible via "Per-group metrics" in Stackdriver.
Per-group metrics allow autoscaling with a standard or custom metric that does not export per-instance utilization data. Instead, the group scales based on a value that applies to the whole group and corresponds to how much work is available for the group or how busy the group is. The group scales based on the fluctuation of that group metric value and the configuration that you define.
More information at https://cloud.google.com/compute/docs/autoscaler/scaling-stackdriver-monitoring-metrics#per_group_metrics
The how-to is too long to post here.
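Since the full how-to won't fit here, below is a rough sketch of the half you have to write yourself: publishing a group-level backlog metric with the google-cloud-monitoring client, which a per-group autoscaling policy can then consume. The metric name, project ID, and queue-length lookup are placeholders; check the client library docs for the exact API of the version you install.

    # Publish the current queue backlog as a custom Cloud Monitoring metric.
    # Assumes google-cloud-monitoring >= 2.x and Application Default Credentials.
    import time

    from google.cloud import monitoring_v3

    PROJECT_ID = "my-project"

    def queue_length() -> int:
        """Stub: return the current backlog of your queue."""
        raise NotImplementedError

    client = monitoring_v3.MetricServiceClient()
    project_name = f"projects/{PROJECT_ID}"

    series = monitoring_v3.TimeSeries()
    series.metric.type = "custom.googleapis.com/batch/queue_length"
    series.resource.type = "global"

    now = time.time()
    interval = monitoring_v3.TimeInterval(
        {"end_time": {"seconds": int(now), "nanos": int((now % 1) * 1e9)}}
    )
    point = monitoring_v3.Point(
        {"interval": interval, "value": {"int64_value": queue_length()}}
    )
    series.points = [point]

    client.create_time_series(name=project_name, time_series=[series])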
As far as I understand, this is not implemented yet (as of January 2016). At the moment, autoscaling is only targeted at web-serving scenarios, where you want to serve web pages or other web services from your machines and keep some reasonable headroom (e.g. in terms of CPU or other metrics) for spikes in traffic. The system will then adjust the number of instances/VMs to match your target.
You are looking for autoscaling for batch processing scenarios, and this is not catered for at the moment.

Graphhopper - Travel Times Between All 30,000 Visible Zip Codes?

I'd like to calculate the matrix of travel times between US zipcodes. There are about 30k visible zipcodes, so this is 900 million calculations (or 450 million assuming travel time is the same in both directions).
I haven't used GraphHopper before, but it seems suited to the task. My questions are:
What's the best way of doing it?
Will this overload the graphhopper servers?
How long will it take?
I can supply latitude and longitude for each pair of zip codes.
Thanks - Steve
I've not tested GraphHopper with such a large number of points yet, but it should be possible.
What's the best way of doing it?
It would probably be faster to avoid the HTTP overhead and use the Java library directly, as in this example. Be sure to assign enough RAM, as the matrix itself is already about 2 GB (30,000 × 30,000 × 2 bytes ≈ 1.8 GB) even if you use only a short value for the distance or time. See also this question.
Will this overload the graphhopper servers?
The API cannot be used without an API key, which you can grab here. Or set up your own GraphHopper server.
How long will it take?
It will probably take some days, though.
Warning, enterprisey note: we provide support for setting up your servers or for your use case, and we also sell a matrix add-on which makes those calculations at least 10 times faster.
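If you do end up setting up your own GraphHopper server and querying it over HTTP anyway, a rough sketch of walking through the pairs might look like the following. The /route endpoint, the profile parameter, and the response fields are assumptions based on the self-hosted server's routing API; adjust them to the version you run, and note the example coordinates are only approximate zip-code centroids.

    # Query a self-hosted GraphHopper instance for pairwise travel times.
    import requests

    GH_URL = "http://localhost:8989/route"

    def travel_time_seconds(origin, destination):
        """origin/destination are (lat, lon) tuples; returns drive time in seconds."""
        resp = requests.get(
            GH_URL,
            params=[
                ("point", f"{origin[0]},{origin[1]}"),
                ("point", f"{destination[0]},{destination[1]}"),
                ("profile", "car"),
                ("calc_points", "false"),  # only the summary is needed, not the geometry
            ],
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()["paths"][0]["time"] / 1000.0  # "time" is in milliseconds

    # Fill (part of) the matrix for a small list of zip-code centroids.
    zips = {"10001": (40.7506, -73.9972), "94103": (37.7726, -122.4099)}
    for a, origin in zips.items():
        for b, destination in zips.items():
            if a != b:
                print(a, "->", b, travel_time_seconds(origin, destination), "s")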

Mongodb geospatial index vs GoogleMaps Directions Service

I recently worked on a small project on location-based services, and my intention was to locate the nearest cab (GPS-fitted) within a given radius of a requesting passenger (GPS-enabled Android phone). I wanted to use MongoDB's geospatial indexes, but it turned out that geospatial indexes work on lat/longs and calculate the straight-line displacement between two points, not the road distance. In my case, the search was confined to a city, and I had to go for the Google Maps Directions Service because it tells the distance as on the road, the estimated time taken, etc.
Does this mean that geospatial indexes make sense only when displacement is large enough that distance and displacement become essentially the same?
Geospatial indexes have the goal of fast data retrieval based on position in a multi-dimensional space. If you have the cab position data in a MongoDB database, you can use a geospatial index to quickly select a reduced set of cabs which are most likely to be the closest ones, but you still have to calculate the distance on the road (and eventually the drive time) using an algorithm on the road network.
For example, if the cab that is closest in a straight line turns out to be 20 km from you measured along the road, then any cab outside a 20 km (straight-line) radius will surely be further away on the road than the first one you found, so you're not interested in them.
You can then use a MongoDB geospatial index to get all the cabs within a 20 km radius, and among them find the one with the minimum road distance.
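A rough sketch of that pre-filtering step with pymongo (the database, collection, and field names are invented for the example; the radius query uses the standard 2dsphere operators):

    # Pre-filter cabs with a 2dsphere geospatial index, then hand the small
    # candidate set to a routing/directions service for real road distances.
    from pymongo import MongoClient, GEOSPHERE

    client = MongoClient("mongodb://localhost:27017")
    cabs = client.ride_db.cabs
    cabs.create_index([("location", GEOSPHERE)])  # GeoJSON points in "location"

    passenger = {"type": "Point", "coordinates": [-122.4194, 37.7749]}  # [lon, lat]

    candidates = list(cabs.find({
        "location": {
            "$nearSphere": {
                "$geometry": passenger,
                "$maxDistance": 20000,  # metres: straight-line pre-filter radius
            }
        }
    }))

    # Query the directions service only for these candidates and pick the
    # one with the shortest road distance / drive time.
    for cab in candidates:
        print(cab["_id"], cab["location"]["coordinates"])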

Bing Maps API - SQL - geometry vs geography type

I'm developing a mapping service with the Bing Maps AJAX API and SQL Server 2008. The question that arises for me is whether I should use the geography or the geometry data type. I researched a lot but didn't find a satisfactory answer. Here are some links about the topic:
SQL 2008 geography & geometry - which to use?
http://www.mssqltips.com/tip.asp?tip=1847
https://alastaira.wordpress.com/2011/01/23/the-google-maps-bing-maps-spherical-mercator-projection/
If I compare the two types I see the following points.
pro geography
consistent distance calculation around the world (date line!)
the coordinate system of the database is the same as the one which is used to add data to a map with the Bing Maps API (WGS84)
precise
contra geography
high computational costs
data size constrained to one hemisphere
missing functions (STConvexHull(), STRelate(),...)
pro geometry
faster computation
unconstrained data size
contra geometry
distance units in degree (if we use WGS84 coordinates)
The problem for me is that I don't need a fast framework, great coverage (the whole world), or high functionality, so I would prefer the geometry type.
The problem with the geometry type is that I have to transform my data into a flat projection (Bing Maps uses SRID=3857) so that I get meters for the calculation. But when I use the Bing Maps projection (3857) in the database, I have to transform my data back to WGS84 if I want to display it on the map.
You've provided quite a good summary of the differences between the two types, and you've correctly identified the two sensible alternatives to be either geography(4326) or geometry(3857), so I'm not quite sure what more information anyone can provide - you just need to make the decision yourself based on the information available to you.
I would say that, although the geometry datatype is likely to be slightly quicker than the geography datatype (since it relies on simpler planar calculations, and can benefit from a tight bounding box over the area in question), this increase in performance will be more than offset by the fact that you'll then have to unproject back to WGS84 lat/long in order to pass back to Bing Maps - reprojection is an expensive process.
You could of course store WGS84 angular coordinates using the geometry datatype, but this is really a hack and not recommended - you are almost certain to run into difficulties further down the line.
So, I'd recommend using the geography datatype and WGS84. With careful index tuning, you should still be able to get sub-second response time for most queries of even large datasets. Incidentally, the "within a hemisphere" rule is lifted for the geography datatype in SQL Denali, so that limitation goes away if you were to upgrade.
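If you do go with geography and WGS84 (SRID 4326), a rough sketch of a nearest-features query from Python via pyodbc is shown below. The table and column names are invented for the example; the T-SQL uses the documented geography::Point and STDistance methods, which return distances in meters.

    # Find the ten features closest to a lat/long, using the geography type.
    # Connection string, table, and columns are placeholders.
    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;"
        "DATABASE=Mapping;Trusted_Connection=yes"
    )

    lat, lon = 47.6097, -122.3331  # e.g. a point in Seattle
    sql = """
    SELECT TOP (10) Id, Name,
           Location.STDistance(geography::Point(?, ?, 4326)) AS MetersAway
    FROM dbo.Places
    ORDER BY Location.STDistance(geography::Point(?, ?, 4326));
    """
    for row in conn.cursor().execute(sql, lat, lon, lat, lon):
        print(row.Id, row.Name, row.MetersAway)

With a spatial index on the Location column, this TOP ... ORDER BY STDistance pattern is also the shape the query optimizer can use for nearest-neighbor lookups.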