Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers. Closed 8 years ago.
What's a good library for geospatial functions? I mean things like
distance between two points on the globe
coordinates of a circle of a given radius from a particular point
etc.
Bonus if there's an interface to the various ways different databases represent geolocations.
I'm a geo-noob (in case this question didn't make it obvious), so pointers to other geolocation/geospatial resources are welcome.
C++ and Python preferred, but all pointers welcome.
I've enjoyed using geopy. It's a simple library that finds great-circle distance in a number of projections. Geopy also provides a single interface to multiple geocoders like Google Maps and Microsoft Earth to give you coordinates for a street address.
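For the distance part, here is a minimal sketch (this assumes a reasonably recent geopy release; the module layout has changed between versions, so treat the import as an assumption):
# A minimal sketch of great-circle distance with geopy (coordinates are just example values)
from geopy.distance import great_circle

newport_ri = (41.49008, -71.312796)     # (lat, lon)
cleveland_oh = (41.499498, -81.695391)
print(great_circle(newport_ri, cleveland_oh).miles)  # great-circle distance in miles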
You might be interested in the Topic :: Scientific/Engineering :: GIS section on PyPI.
Some options for functions from a useful article on the O'Reilly website. There are other options in the article.
GEOS - open source C++ geometry/topology engine. Might suit you?
PostGIS - a PostgreSQL database that can also store geometric (spatial) data types. This provides GIS-like abilities within an SQL database environment, so you could do manipulations through SQL.
I'm not sure about interfaces to different databases but the article mentions a number of libraries that convert geospatial data between different formats. You might also be interested in the OGC standards. If all your target databases support WFS you should be able to access them with exactly the same code. EDIT: Increasing numbers of GIS packages support WFS but I don't think pure databases do.
EDIT: you could also check out OSGeo which is a collaboration to support open source geospatial stuff. FDO might do the interfaces to different databases.
If you were using ruby with or without rails, I'd recommend the GeoKit gem: http://geokit.rubyforge.org/
Check GDAL/OGR, which is two libraries in one - GDAL for raster data and OGR for vector data. The OGR part provides an API to manipulate geometries - it uses the GEOS library.
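As a rough sketch of what the OGR geometry API looks like from Python (this assumes the osgeo Python bindings are installed; the WKT strings are made-up examples):
# A rough sketch using the osgeo (GDAL/OGR) Python bindings
from osgeo import ogr

point = ogr.CreateGeometryFromWkt('POINT (30 10)')
poly = ogr.CreateGeometryFromWkt('POLYGON ((0 0, 40 0, 40 40, 0 40, 0 0))')
print(point.Within(poly))    # point-in-polygon test (delegates to GEOS)
print(point.Distance(poly))  # planar distance between the two geometries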
I was looking at something similar; the Geolocator Library is one that I found, and it provides some distance-calculation functions.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers. Closed 7 years ago.
Is there any open-source alternative to the Zonal Statistics tool (ArcGIS Spatial Analyst)?
What is the best tool (which I can use in a script) for computing statistics of raster files?
You can do this with GRASS using various methods. Which one is most suitable will depend on your data and the required output. Note that you can use GRASS also from within QGIS using the GRASS toolbox or Sextante toolbox.
Let's assume you have:
a vector map, e.g., vector_zones, with the zones defined in the column myzones in the attribute table
a raster layer values for which you want to calculate your zonal statistics
r.statistics
To use r.statistics, you first need to convert the vector map to a raster layer, which you can do with v.to.rast. Next, use r.statistics to calculate the zonal statistics.
v.to.rast input=vector_zones output=zones column=myzones
r.statistics base=zones cover=values out=outputmap method=average
This will give you a new layer with the selected zonal statistic, which could be average, mode, median, variance, etc. (see the man page link above).
r.univar
The r.univar function also works on raster layers.
v.to.rast input=vector_zones output=zones column=myzones
r.univar map=values zones=zones output=output.file fs=;
The output is a table with the zonal statistics.
v.rast.stats
This does not require you to convert the vector layer to a raster layer (this is done internally). The function calculates basic univariate statistics per vector category (cat) from the raster map.
v.rast.stats vector=vector_zones layer=1 raster=values column_prefix=val
The results are uploaded to the vector map attribute table.
You can use the raster package in R:
library(raster)
v <- raster('raster filename')
z <- raster('zones raster filename')
zv <- zonal(v, z, fun=mean)
Correct me if I'm wrong, RobertH, but I believe zonal() requires that the zones are already 'rasterized' in some sense, whereas many times one will want the statistics of raster cells that fall within polygons. The various overlay methods in R within the sp package (see ?"overlay-methods") are necessary for this, though if I am wrong I would be glad to hear it. I quite prefer the raster package over using SpatialGridDataFrames, but I think one must rely on sp classes to mix polygons and gridded data. Which is OK, except it becomes problematic because sp lacks the great memory management of the raster package, which makes point-in-polygon style operations really hard to do in R on large rasters.
I am also led to believe, but have not tried, that this can be done within GRASS and/or through QGIS, with the next release of QGIS (1.7) expected to have some sort of built-in zonal-stats feature.
The rasterstats package is a nice open-source tool that worked well for me:
http://blog.perrygeo.net/2013/09/24/python-raster-stats/
I started using it as a workaround because arcpy's ZonalStatistics method was producing a problematic raster that led to an odd error when trying to convert the raster to an array (https://gis.stackexchange.com/questions/110274/save-fails-on-raster-object-created-from-numpyarraytoraster). rasterstats worked well and provided an efficient solution for my problem.
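For reference, a minimal sketch of the kind of call I ended up using (the file names here are hypothetical):
# A minimal sketch of zonal statistics with rasterstats (zones.shp and elevation.tif are placeholders)
from rasterstats import zonal_stats

stats = zonal_stats('zones.shp', 'elevation.tif',
                    stats=['mean', 'min', 'max', 'count'])
for zone in stats:
    print(zone['mean'], zone['count'])   # one dict of statistics per zone polygon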
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers. Closed 8 years ago.
I have a project in a course at the university that requires various Matlab functions. My version of Matlab is supplied by my workplace, and it doesn't have some of the toolboxes that are required for the project. Is there some repository of such implementations anywhere? I couldn't find anything relevant when I googled.
While my question is general, I'm listing the functions that I need and can't find, since my question is also specific:
knnclassify - an implementation of K-nearest neighbors
svmclassify - an implementation of Support Vector Machines
svmtrain - part of the SVM implementation
mapstd - normalizes a matrix to have a mean of 0 and standard deviation of 1
An alternative I'm considering is working in Python with Numpy and Pylab. Are there toolboxes in Pylab that are equivalent to these Matlab functions?
The first place I would check is the MathWorks File Exchange. There are over 10,000 code submissions from MATLAB users, and you may be able to find alternatives to the various MATLAB toolboxes. Here's something that may be helpful:
Luigi Giaccari has a few highly-rated submissions related to KNN searching here, here, and here.
Another alternative for a simpler function like MAPSTD is to try and implement a stripped-down version of it yourself. Here's some sample code that replicates the basic behavior of MAPSTD:
M = magic(5);              % Create an example matrix
rowMeans = mean(M,2);      % Compute the row means
rowStds = std(M,0,2);      % Compute the row standard deviations
rowStds(rowStds == 0) = 1; % Set standard deviations of 0 to 1
for i = 1:size(M,1)
    M(i,:) = (M(i,:)-rowMeans(i))./rowStds(i); % Normalize each row of M
end
% Or you could avoid the for loop with a vectorized solution...
M = bsxfun(@rdivide, bsxfun(@minus, M, rowMeans), rowStds);
This obviously won't cover all of the options in MAPSTD, but captures the basic functionality. I can confirm that the above code gives the same result as mapstd(M).
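And if you take the NumPy route mentioned in the question, a rough equivalent of the same row-wise normalization might look like this (not a drop-in replacement for MAPSTD, just its default behavior):
# A rough NumPy sketch of MAPSTD's default behavior (mean 0, std 1 per row); the example matrix is arbitrary
import numpy as np

M = np.arange(25, dtype=float).reshape(5, 5)      # example matrix
row_means = M.mean(axis=1, keepdims=True)
row_stds = M.std(axis=1, ddof=1, keepdims=True)   # ddof=1 matches MATLAB's default std
row_stds[row_stds == 0] = 1                       # avoid division by zero
M_normalized = (M - row_means) / row_stds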
You might want to consider getting your own copies of Matlab and the toolboxes you need. Mathworks has VERY attractive pricing for University students.
GNU Octave is the free Matlab more-or-less-work-alike. I don't know how well the toolboxes are covered.
Alternatively, if the assignment needs them, the school probably has them on some lab machines somewhere, and you MIGHT be able to login remotely, using an Xterm of some kind. Ask the TA.
You could also look at R which is very strong in many data-driven fields, including Machine Learning.
Also of note is the MLOSS directory of open-source machine learning software.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers. Closed 8 years ago.
Reading well-written code seems to help me learn a language. (At least it worked with C.) [deleting the 'over-specified' part of the question]
I'm interested in particular in lisp's reputation as a language suited to creating a mini-language or DSL specific to a problem. The program ought to be open-source, of course, and available over the web, preferably.
I've Googled and found this example:
http://lispm.dyndns.org/news?ID=NEWS-2005-07-08-1
Anybody have another? (And, yes, I will continue reading "Practical Common Lisp".)
After 11 hours (only 11 hours!): Thanks, everyone. What a wonderful site, and what a bunch of good answers and tips!
I feel your constraints are over-specified:
small enough to comprehend, varied enough to show off most of (c)lisp's tricks and features without being opaque (the 'well-written' part of the wish), and independent of other packages.
Common Lisp is a huge language, and the power set that emerges when you combine the language elements is much larger. You can't have a small program showing "most tricks" in CL.
There are also many concepts that you will find alien when you learn CL coming from another language. As such, CL is less about tricks and more about its fundamental paradigms.
My suggestion is to read up on it a bit first and then start building your own programs or looking into open source code.
Edi Weitz for example usually writes good code. Check out his projects at http://www.weitz.de/.
And now go read PCL. :)
I'm a bit too lazy to find the links, but you should be able to Google/Bing them. The following list mentions very different ways to embed languages and very different embedded languages.
ITERATE for iterations
System/Module/File description in 'defsystem's, an example would be ASDF
infix readmacro
define-application-frame in CLIM for specifying user interfaces
embedded Lispified SQL queries in LispWorks and CLSQL
Knowledgeworks of LispWorks: logic language with rules, queries, ...
embedded Prolog in Allegro CL
embedded HTML in various forms
XMLisp, integrates XML and Lisp
Screamer for non-deterministic programming
PWGL, visual programming for composing music
Note that there are simple embedded languages and really complex ones that are providing whole new paradigms like Prolog, Screamer, CORBA, ...
If you haven't taken a look at it yet, the book Practical Common Lisp is available free online and has several example projects.
The LOOP macro is an almost perfect example of a DSL embedded in Common Lisp. However, since it's already part of the standard, it may not be what you're after.
CL's format function has a mini-DSL of its own:
http://cybertiggyr.com/fmt/
I think that DSL for printing strings will compile down to machine code.
(format nil "~{~A~#[~:;, ~]~}" lst)
CLSQL provides a Lispy notation for SQL queries, which it compiles to SQL, and just about all Lisp HTML and XML generation libraries qualify. Metabang bind is a DSL for lexically binding variables. You probably didn't know you needed one, but it turns out to be amazingly useful.
SERIES is kind of a DSL, depending on your definition. It's in an appendix to CLTL2, though it's not actually part of the language.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers. Closed 7 years ago.
I've been provided with a table of data which includes columns for latitude and longitude. The data is US-only. What I'd like to do is feed this data to Google Maps or a similar tool like Live Maps and have the data points plotted.
Does anyone have a code sample or know of a library that makes this task simple? I can read the values if something else can plot them (or generate the JavaScript to do it). I'm familiar with C#, PHP, Python, etc., so language is not a priority. My preference is something that is simple and robust.
I've dealt with this problem before many times. While KML is one option, if you have a CSV with the latitude and longitude already in it, my quick and dirty method is to use Google Fusion Tables: http://www.google.com/fusiontables/Home
Even if you have just names of cities and towns (anywhere in the world!) it does an okay job.
If you only care about US locations, and the G20 (rich, well-geocoded nations), then BatchGeo is probably the best quick method, but it limits you to 2500 points on the free version: http://batchgeo.com/
But for my needs, BatchGeo's method falls short. When you drill down in Kenya you see that it gets many locations wrong. Those 666 locations in Nairobi are slums with names that appear in no official database. Since Kenya isn't one of the top 20 tech-friendly countries, this is a common source of error. Most interesting locations appear in slums. (See also www.mapKibera.org for the effort involved in fixing blank spots in Google Maps.)
I have a more extensive blog post about this, with lots of images of what your output will look like: chewychunks.wordpress.com/2011/06/09/how-to-geomap-story-locations-across-east-africa/ (newbies are not allowed to post images here directly).
The best solution for me required downloading 44,000 locations from the Geonames.org list for Kenya and Uganda, adding a custom geo-lookup list of slum locations, and a multi-step Python matching algorithm based on difflib and regex.
Answer: use this website http://www.darrinward.com/lat-long/?id=257165 and it will plot them for you.
You can plot in Google Earth by creating a special XML document known as a KML document. See this tutorial for details.
I plotted all of our website visitor's coordinates using this plus a GeoIP service. Really fun stuff.
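If you want to generate the KML yourself rather than follow the tutorial, a bare-bones sketch might look like this (the element names come from the standard KML schema; the point data is made up):
# A bare-bones sketch that writes a KML file of point placemarks (names and coordinates are hypothetical)
points = [("Site A", 38.8977, -77.0365), ("Site B", 40.7484, -73.9857)]  # (name, lat, lon)

with open('points.kml', 'w') as f:
    f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
    f.write('<kml xmlns="http://www.opengis.net/kml/2.2"><Document>\n')
    for name, lat, lon in points:
        # KML expects longitude,latitude order inside <coordinates>
        f.write('<Placemark><name>%s</name>'
                '<Point><coordinates>%f,%f</coordinates></Point></Placemark>\n' % (name, lon, lat))
    f.write('</Document></kml>\n')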
CSV is widely used in Google Maps mashups. You can find the code including the CSV parser here.
This link has information about using KML and the Google Earth plugin API to display geo-referenced data in csv files.
http://www.cs.umb.edu/~joecohen/#kml
I found this page while looking for some csv files to use while teaching a class about this. If you have some good play data I'd like to see it.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers. Closed 8 years ago.
I have been looking at cloud computing / storage solutions for a long time (inspired by Google's Bigtable), but I can't find an easy-to-use, business-ready solution.
I'm searching for a simple, fault-tolerant, distributed key=>value DB like Amazon's SimpleDB.
I've seen things like:
The CouchDB Project: a simple, distributed, fault-tolerant database. But it understands only JSON; no XML connectors, etc.
Eucalyptus: nice Amazon EC2 interfaces, open standards & XML. But less distributed and less fault-tolerant? There are also a lot of open tickets with Xen/VMware issues.
Cloudstore / Kosmosfs: a nice distributed, fault-tolerant filesystem, but it's hard to configure. Are there any Java connectors?
Apache Hadoop: a nice system which offers much more than just the ability to store data. It uses its own Hadoop Distributed File System and has been tested on clusters with 2000 nodes.
Amazon SimpleDB: I can't find an open-source alternative! It's a nice but expensive system for huge amounts of data, and you're tied to Amazon.
Are there other, better solutions out there? Which one is the best to choose? Which one has the fewest single points of failure (SPOFs)?
How about memcached?
The High Scalability blog covers this issue; if there's an open source solution for what you're after, it'll surely be there.
Other projects include:
Project Voldemort
Lightcloud - Key-Value Database
Ringo - Distributed key-value storage for immutable data
Another good list: Anti-RDBMS: A list of distributed key-value stores
MongoDB is another option which is very similar to CouchDB, but it uses a query language very similar to SQL instead of map/reduce in JavaScript. It also supports indexes, query profiling, replication and storage of binary data.
It has a huge amount of documentation, which might be overwhelming at first, so I would suggest starting with the Developer's tour.
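To give a feel for it, here is a tiny sketch of key/value-style usage from Python (this assumes the pymongo driver and a local mongod; the database and collection names are made up, and insert_one/find_one come from the newer driver API):
# A tiny sketch using pymongo against a local mongod (database/collection names are hypothetical)
from pymongo import MongoClient

db = MongoClient('localhost', 27017).mydb
db.kv.insert_one({'_id': 'user:42', 'value': {'name': 'Alice', 'score': 7}})
doc = db.kv.find_one({'_id': 'user:42'})   # look up by key
print(doc['value']['score'])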
Wikipedia says that Yahoo both contributes to Hadoop and uses it in production (article linked from Wikipedia). So I'd say it counts as business-proven, although I'm not sure whether it counts as a K/V database.
Not on your list is the Friendfeed system of using MySQL as a simple schema-less key/value store.
It's hard for me to understand your priorities. CouchDB is simple, fault-tolerant, and distributed, but somehow you exclude it because it doesn't have XML. Are XML and Java connectors an unstated requirement?
(Anyway, CouchDB should in fact be excluded because it's young, its API isn't stable, and it's not a key-value store.)
I use Google's Google Base API; it's XML-based, free, documented, cloud-based, and has connectors for many languages. I think it will fit the bill if you want free hosting too.
Now if you want to host your own servers, Tokyo Cabinet is your answer. It's key=>value based, uses flat files, and is the fastest database out there right now (very barebones compared to, say, Oracle, but incredibly good at storing and accessing data: about 1 million records per second, with about 10 bytes of overhead per record, depending on the storage engine). As for being business-ready, Tokyo Cabinet is at the heart of a service called Mixi, which is the equivalent of Japan's Facebook+MySpace, with several million heavy users, so it's actually very battle-proven.
If you want something like Bigtable, you can't go past HBase or Hypertable - they're both open-source Bigtable clones. One thing to consider, though, is if your requirements really are 'big enough' for Bigtable. It scales up to thousands of tablet servers, and as such, has quite a bit of infrastructure under it to enable that (for example, handling the expectation of regular node failures).
If you don't anticipate growing to, at the very least, tens of tablet servers, you might want to consider one of the proposed alternatives: you can't beat BerkeleyDB for simplicity, or MySQL for ubiquity. If all you need is a key/value datastore, you can put a simple 'dict' wrapper around your database interface and switch out your backend if you outgrow one, as in the sketch below.
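To illustrate the 'dict wrapper' idea, here is a rough sketch of such an abstraction; Python's built-in dbm module stands in for whichever backend you actually pick:
# A rough sketch of a dict-style wrapper; dbm is a stand-in for the real backend
import dbm

class KeyValueStore:
    def __init__(self, path):
        self.db = dbm.open(path, 'c')   # swap this line out for BerkeleyDB, MySQL, etc.

    def __getitem__(self, key):
        return self.db[key.encode()].decode()

    def __setitem__(self, key, value):
        self.db[key.encode()] = value.encode()

    def close(self):
        self.db.close()

store = KeyValueStore('example.db')
store['user:1'] = 'alice'
print(store['user:1'])
store.close()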
You might want to look at Hypertable, which is modeled after Google's Bigtable.
Use CouchDB.
What's wrong with JSON? JSON to XML is trivial.
You might want to take a look at this (using MySQL as key-value store):
http://bret.appspot.com/entry/how-friendfeed-uses-mysql
Cloudera is a company that commercializes Apache Hadoop, with some value-add of course, like productization, configuration, training & support services.
Instead of looking for something inspired by Google's Bigtable, why not just use Bigtable directly? You could write a front-end on Google App Engine.
A good compilation of storage tools for your question:
http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores/
Tokyo Cabinet has also received some attention, as it supports table schemas, key-value pairs and hash tables. It uses Lua as an embedded scripting platform and HTTP as its communication protocol. Here is a great demonstration.