JSON library for Cloud Dataproc

I need to find a JSON library for Google Cloud Dataproc.
I'm not sure where I can find a list of supported JSON libraries.
Or, if I write my own, which dependencies can I bring into Dataproc?
Any information on this topic would be highly appreciated.
Best Regards,
Oleg

If you are talking about reading/parsing JSON objects, then you can use the Gson library, which is part of the Hadoop distribution on Dataproc.
You can also use a JSON library of your choice and any other dependencies, but you should build an uber jar for your job and include all of these libraries/dependencies in it.
If you are talking about the Google JSON API Client libraries, then Dataproc deploys version 1.20.0 by default as part of the GCS and BigQuery connectors. You can still use a newer JSON API Client library version if you relocate it inside your job's uber jar to avoid conflicts with the version deployed on Dataproc.
See the more detailed answer on managing conflicting dependencies in Dataproc here.

Related

Autodesk Forge Data Management DotNet Package

I wonder if there is a Forge Data Management library for .NET, like the Design Automation package.
https://github.com/Autodesk-Forge/forge-api-dotnet-design.automation
That way I could use Data Management to store the files used in Design Automation in an easy way.
I found some samples, but nothing as compact as a single DataManagementClient class.
I created my own package with a simple implementation for the OSS.
https://www.nuget.org/packages/ricaun.Autodesk.Forge.Oss
https://github.com/ricaun-io/forge-api-dotnet-oss
We have a .NET SDK that handles the Data Management API (among other services).
You can find it on NuGet and GitHub.
In this SDK we have classes for each specific Data Management entity, such as:
Hub
Project
Folder
Item
Version
The most recent version (1.9.7) also covers a simple way to handle binary transfers, as described in this blog post.
We are also looking for early adopters of the alpha version of a new SDK for the APS OSS service (refer here)

Export a KNIME workflow as a standalone application or JAR

Is there a way to export or compile a KNIME workflow as a standalone Java application or JAR? I'd like to run the workflow on a platform where KNIME cannot be installed and/or as part of a larger program to simplify a complex but isolated piece of analytics. My options are many, but installing KNIME on the target platform is not one of them.
The only relevant reference I can find on the KNIME site is this ten-year-old(!) forum question. The only answer there links to this project which does seem to be active and says it is 'a KNIME extension that exports KNIME workflows to different workflow engines', though without digging into its code it's not clear what engines those are.
Other than that, I guess your options are:
ask on the KNIME forum again
since KNIME is open source and is based on Eclipse, look into the more general question of how to build and run a minimal standalone version of Eclipse - there seem to be some relevant-looking answers on here if you search, but I have no further knowledge on how to do it
use scripting nodes in KNIME to develop a text language version of your workflow, verifying as you go that the output adequately matches the KNIME nodes at each step, and deploy the text language version to your target. If you need data mining methods you might want to look at the Weka integration nodes which you could then substitute with calls to Weka methods.

How to launch Google Compute instances programmatically?

In the AWS SDK, EC2 instances can be launched programmatically via the AmazonEC2Client. Does GCP in general or Compute Engine specifically just offer the CLI-based gcloud command for the equivalent operation? Or can GCE instances be controlled from Java/Python/Go/etc as well? Which SDK exists for those languages and where are the examples & docs for this?
I am looking for the equivalent of this in the GCP world:
AmazonEC2Client client = new AmazonEC2Client(credentials);
client.runInstances(new RunInstancesRequest());
You're looking for the Google Cloud Client Libraries, of which gcloud-java is the Java implementation. There are also Client Libraries in Go, Node.js, Python, and Ruby.
Under the covers everything in Google Cloud is available via an API, so even if there isn't a client library for what you're trying to accomplish, it can be done programmatically by calling the API directly.
The documentation on launching instances has an API tab that shows both Client Library and REST API examples.
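To make the "calling the API directly" point concrete, here is a rough sketch in plain Java (11+, no Google libraries) that builds the REST request for creating an instance. The endpoint is the Compute Engine v1 instances.insert method. The class name, the project, zone and instance names, the e2-micro machine type, the Debian image family and the ACCESS_TOKEN placeholder are all assumptions for illustration, so check the request body against the current REST reference before relying on it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch only: builds (but does not send) the request that asks Compute Engine to create a VM.
class CreateInstanceRequestSketch {

    // Builds the POST request for the instances.insert REST method.
    static HttpRequest build(String project, String zone, String name, String accessToken) {
        String url = "https://compute.googleapis.com/compute/v1/projects/" + project
                + "/zones/" + zone + "/instances";
        return HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", "Bearer " + accessToken)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body(zone, name)))
                .build();
    }

    // Minimal JSON body; machine type and image family are illustrative assumptions.
    static String body(String zone, String name) {
        return "{"
                + "\"name\":\"" + name + "\","
                + "\"machineType\":\"zones/" + zone + "/machineTypes/e2-micro\","
                + "\"disks\":[{\"boot\":true,\"autoDelete\":true,"
                + "\"initializeParams\":{\"sourceImage\":"
                + "\"projects/debian-cloud/global/images/family/debian-12\"}}],"
                + "\"networkInterfaces\":[{\"network\":\"global/networks/default\"}]"
                + "}";
    }

    public static void main(String[] args) {
        // All four arguments are placeholders; a real call needs a valid OAuth 2.0 access token.
        HttpRequest request = build("my-project", "us-central1-a", "demo-vm", "ACCESS_TOKEN");
        System.out.println(request.method() + " " + request.uri());
        // To actually send it:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

In practice the client library is usually the better choice, since it handles authentication, retries and waiting for the long-running operation; the raw request is mainly useful for seeing what happens under the covers.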
It looks like this is still in alpha, but it is available on GitHub: here and here. There is an example for starting GCE instances in the java-docs-samples project.

Framework for Mapserver GIS APP

I'm trying to display on the web (read: create a GIS web app) topo data layers stored in a PostGIS/PostgreSQL spatial database using MapServer. My problem is that, although I have come across different GIS frameworks I could use, my lack of experience with MapServer in the first place leaves me undecided about which framework to use. So what is the easiest framework out there to use? I'm using the pre-packaged MS4W MapServer binaries, and I've installed almost all of the additional packages (frameworks) from their site.
Thanks for the help! :)
MapFish (Python and C based, I think, and ideal with MapServer)
GeoServer (java based)
FeatureServer (RESTful, light and effective)
Other interesting links:
GeoExt provides an excellent extension for OpenLayers
Boston GIS provides excellent tutorials, as do Paul Ramsey and Chris Schmidt
The mother GIS - Free Open Source Software OSGeo
FreeGIS - Continually updated list of free and related GIS software
I've used GeoServer and FeatureServer on multiple occasions, and never got deep into MapServer. I know that MapServer has a big community and they love helping out; check them out on IRC and their mailing list.
We have developed an interface called OWGIS for displaying GIS data.
Website: http://www.owgis.org
Description:
OWGIS (Open WebGIS) is an open-source Java Servlets web application that creates WebGIS sites by automatically writing HTML and JavaScript code. The WebGIS sites are configured by XML files that define which layers will be displayed on the maps as well as the text to be used in the interface. OWGIS's most notable features include animations, vertical profiles and vertical transects, various color palettes, dynamic maps, downloadable data, and multilingual interfaces. All these features are created automatically without any additional web programming.
Since you already have MS4W installed, the easiest way to publish a map service from Postgres is with MapServer, which is a component of MS4W.
To start publishing WMS from MapServer:
1) Read through the documentation for the Mapfile, which is the service definition file that describes how the WMS is configured.
2) Read the OGR PostGIS connection documentation. You should be able to write the database connection fairly easily by following the instructions.
3) Once you have a valid Mapfile with the correct PostGIS connection string, you can publish the WMS for your topo layers.
MapServer is very powerful and easy to use. Its file-based service definitions provide a lot of flexibility, which is critical when you need to publish something dynamically.
GeoServer is very popular too and has a GUI that is extremely easy to use: a few clicks and your services are ready to go.
Other solutions are available as well, but consider the community, user base, and tech support. I would recommend MapServer or GeoServer for your case. Just as an FYI, we host USGS topo services on MapServer, and it has been very stable, flexible, and scalable.
Hope this helps.

Why do the terms API and an SDK seem to be used interchangeably?

What is the accepted definition of an API compared to the definition of an SDK? Both seem to be used interchangeably, so I'd imagine that some libraries dubbed APIs are in reality SDKs, and vice versa. Also, is there a reason for this distinction?
Thanks!
An API, or application programming interface, defines a set of classes, functions, and structures to be used by an application to make use of some library or subsystem. For example, the Windows multimedia subsystem and the Windows Sockets subsystem each have their own API. An API is not a concrete entity: you can't point at a file and say that the file itself is an API. An API is merely a specification for a communications protocol that a program needs to follow to make use of a library or subsystem.
An SDK, or software development kit, contains the tools, documentation, and files needed to program against one or more APIs. Some SDKs, but by no means all, may contain sample code to demonstrate how an API can be used. Two examples of an SDK are the Windows Platform SDK and the .NET Framework SDK.
The most likely reason the terms are used interchangeably is that sometimes an SDK only has the tools, documentation, and files for a single API, and both the API and the SDK share the same name. An example of this would be the SDK for developing Winamp plugins.
API - Application Programming Interface. This is what you write code to.
SDK - Software Development Kit. These are the libraries that you need so you can code. An SDK likely has many different APIs contained in it.
None of the answers I've seen so far really capture it clearly and completely.
The two terms are not interchangeable, and if someone uses them interchangeably, it indicates a lack of understanding or precision.
API - Application Programming Interface. Exposed by a class library. Utilized by an application. When you build an application that uses the library, this is the interface your code uses to connect to or call into the library. In other words the set of rules and conventions applications must follow to use the library. The API includes the classes, their methods, the parameter lists for the methods, and other supporting elements (like enumerations, constants and so on). An API is an abstract artifact: You cannot download an API, or install an API. You can describe an API, and you can use one. The library itself is the concrete realization of the API, the delivery mechanism.
SDK - Software Development Kit. This is a concrete thing. You can download an SDK, install it, store it. An SDK includes:
libraries, which, as you know, expose or provide APIs.
header files (if applicable)
documentation - readme files, help files, release notes, reference documentation, programming guides. Realized as .CHM files, pdf documents, etc.
tools - such as compilers, assemblers, linkers, profilers, debuggers, optimizers, test tools, and more.
sample code and sample apps - showing how to use the API
maybe some other stuff that doesn't fit into one of the above categories
A Concrete Example:
the Java SDK is a downloadable, versioned thing. It delivers libraries, tools (on Windows: javac.exe, java.exe, jar.exe, etc.), all the help files and API docs, and source code for the libraries.
the API for Java is the set of rules your code must follow to invoke the libraries; these rules are described in the API documentation.
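The split between the abstract API and its concrete realization can be sketched in a few lines of Java. Everything here (TextStore, InMemoryTextStore, ApiVersusLibraryDemo) is a made-up illustration, not part of any real SDK: the interface plays the role of the API, the implementing class plays the role of the library an SDK would ship, and the demo class is the application written against the API only.

```java
import java.util.ArrayList;
import java.util.List;

// The application: written against the API only, so the implementation can be swapped.
class ApiVersusLibraryDemo {
    static int countLines(TextStore store) {
        return store.all().size();
    }

    public static void main(String[] args) {
        TextStore store = new InMemoryTextStore();
        store.put("hello");
        store.put("world");
        System.out.println(countLines(store)); // prints 2
    }
}

// The API: an abstract contract. It says what callers may do, not how it is done.
interface TextStore {
    void put(String line);
    List<String> all();
}

// The library: one concrete realization of the API, the kind of thing an SDK delivers as a jar.
class InMemoryTextStore implements TextStore {
    private final List<String> lines = new ArrayList<>();

    @Override
    public void put(String line) {
        lines.add(line);
    }

    @Override
    public List<String> all() {
        // Return an unmodifiable copy so callers cannot bypass put().
        return List.copyOf(lines);
    }
}
```

You can describe TextStore in documentation and code against it, but the thing you actually download and put on the classpath is the compiled InMemoryTextStore, bundled with whatever docs, tools and samples make up the kit.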
An API is an Application Programming Interface -- it's something your program can talk to. An SDK usually includes an API along with documentation for the API. An API is not required to contain documentation. An SDK may include the API components, but will always include the documentation.
APIs can exist within SDKs but not vice versa. SDKs generally contain a complete specification of a framework or environment. For example, the Java SDK contains a full specification of the Java language plus tools, external libraries, and whatever else the vendor decides to throw in there. The Java APIs are simply the interface to those specifications.