Classification tree tool and integration with a web site - json

A general question. I have used i.e. Weka classifier model functionallity in their tool. But is there a way to "call Weka" and get a model in response from a website?
It is not important that it is Weka, but I want to implement some simple classification based on a json coming from a web-site.
Thanks.

You can write a REST webservice in Java which loads your model and makes predictions using data it receives, sending back the predictions in a suitable format. There are a number of frameworks for writing such webservices (e.g., JAX-RS).
In terms of using the Weka API, check out the Use Weka in your Java code article.

Related

CSV to JSON benchmarks

I'm working on a project that uses parallel methods to convert text from one form to another. We're going to implement a CSV to JSON converter to demonstrate the speedups that are possible using our parallel framework.
We want to benchmark our converter once it's finished. What are the fastest libraries/stand-alone programs/etc out there that are capable of doing CSV-JSON conversion? I found a list of potential candidates here:Large CSV to JSON/Object in Node.js, but I'm not sure how fast the listed options are. In the worst case I'll benchmark them myself, but if someone already knows what the "best in class" converters are it'd save me some time.
Looks like the maintainer of csvtojson has developed a benchmark application. I think I can add my csv to json converter to his benchmark project to test my converter.
if your project can consider in-browser apps, I suggest csvtojson as it is by far the speediest converter on the market as of 2017.
I created it myself so I may be a bit biaised, but I specifically developed it for a bigger project that required big csv to json crunching.
Tell me if it served.

what does WTX (websphere transformation extender)

I'm not finding a lot of information on the internet about what is and what does WTX.
Can you give me some lights about it, and give me an example?
Also, the EAI BizTalk Server from microsoft is related to wtx?
thanks
WTX is a data transformation tool, for exmaple it will transform CSV into XML, one flavour of XML (e.g. RosettaNet) into another (e.g. an application format) etc.
It is owned and developed by IBM.
Transformations are referred to as maps and are built by dragging and dropping fields in the Eclipse based design tool.
The WTX runtime engine can be called in a variety of ways - e.g. through the Java API, the TX Launcher, IBM Integration Bus, etc.
WTX has been previously known as Mercator and DataStage TX
EAI BizTalk Server is also a tool for integrating applications and transforming data. I'm not aware of any adapters between the two tools, but you could develop one using the WTX API.
WTX is the one of the powerful data transformation tool present in the integration industry .
Generally XML data transformation is done in may ways like XSLT ect . but when it comes to EDI , stander Messages like EDFACT ASCII transformation logic will be very difficult in other tools .
WTX can easily handle these type of transformation and its very flexible tool for transformation for any type of message.
EX: Lets consider two systems communicating and system A send data in IDOC , systmem B wants in XML . You can use a WTX map in between and achieve this . Also you can handle business rules in WTX.

Weka: Limitations on what one can output as source?

I was consulting several references to discover how I may output trained Weka models into Java source code so that I may use the classifiers I am training in actual code for research applications I have been developing.
As I was playing with Weka 3.7, I noticed that while it does output Java code to its main text buffer when use simpler classification (supervised in my case this time) methods such as J48 decision tree, it removes the option (rather, it voids it by removing the ability to checkmark it and fades the text) to output Java code for RandomTree and RandomForest (which are the ones that give me the best performance in my situation).
Note: I am clicking on the "More Options" button and checking "Output source code:".
Does Weka not allow you to output RandomTree or RandomForest as Java code? If so, why? Or if it does and just doesn't put it in the output buffer (since RF is multiple decision trees which I imagine it doesn't want to waste buffer space), how does one go digging up where in the file system Weka outputs java code by default?
Are there any tricks to get Weka to give me my trained RandomForest as Java code? Or is Serialization of the output *.model files my only hope when it comes to RF and RandomTree?
Thanks in advance to those who provide help.
NOTE: (As an addendum to the answer provided below) If you run across a similar situation (requiring you to use your trained classifier/ML model in your code), I recommend following the links posted in the answer that was provided in response to my question. If you do not specifically need the Java code for the RandomForest, as an example, de-serializing the model works quite nicely and fits into Java application code, fulfilling its task as a trained model/hardened algorithm meant to predict future unlabelled instances.
RandomTree and RandomForest can't be output as Java code. I'm not sure for the reasoning why, but they don't implement the "Sourceable" interface.
This explains a little about outputting a classifier as Java code: Link 1
This shows which classifiers can be output as Java code: Link 2
Unfortunately I think the easiest route will be Serialization, although, you could maybe try implementing "Sourceable" for other classifiers on your own.
Another, but perhaps inconvenient solution, would be to use Weka to build the classifier every time you use it. You wouldn't need to load the ".model" file, but you would need to load your training data and relearn the model. Here is a starters guide to building classifiers in your own java code http://weka.wikispaces.com/Use+WEKA+in+your+Java+code.
Solved the problem for myself by turning the output of WEKA's -printTrees option of the RandomForest classifier into Java source code.
http://pielot.org/2015/06/exporting-randomforest-models-to-java-source-code/
Since I am using classifiers with Android, all of the existing options had disadvantages:
shipping Android apps with serialized models didn't reliably work across devices
computing the model on the phone took too much resources
The final code will consist of three classes only: the class with the generated model + two classes to make the classification work.

JSON, REST, SOAP, WSDL, and SOA: How do they all link together

Currently doing some exams and I'm struggling through some concepts. These have all been 'mentioned' in my notes really but I didn't really understand how they all linked together. As far as my understanding is:
SOA - a solution to make service consumers/providers communicate. (as far as I understand this is the umbrella term for everything else)
WSDL - A language that describes the provider service.
SOAP - A XML protocol 'wrapper' used by the services to send messages. Works in conjunction with WSDL as to provide parameters?
REST - A design pattern that is similar to SOAP in function but avoids the XML? (really not sure about this one)
JSON - An alternative to XML that uses javascript? (not sure about this one either)
Looking around in the internet there doesn't seem to be a clear definition of what all of these are and how they interlink.
Imagine you are developing a web-application and you decide to decouple the functionality from the presentation of the application, because it affords greater freedom.
You create an API and let others implement their own front-ends over it as well. What you just did here is implement an SOA methodology, i.e. using web-services.
Web services make functional building-blocks accessible over standard
Internet protocols independent of platforms and programming languages.
So, you design an interchange mechanism between the back-end (web-service) that does the processing and generation of something useful, and the front-end (which consumes the data), which could be anything. (A web, mobile, or desktop application, or another web-service). The only limitation here is that the front-end and back-end must "speak" the same "language".
That's where SOAP and REST come in.
They are standard ways you'd pick communicate with the web-service.
SOAP:
SOAP internally uses XML to send data back and forth. SOAP messages have rigid structure and the response XML then needs to be parsed.
WSDL is a specification of what requests can be made, with which parameters, and what they will return. It is a complete specification of your API.
REST:
REST is a design concept.
The World Wide Web represents the largest implementation of a system
conforming to the REST architectural style.
It isn't as rigid as SOAP. RESTful web-services use standard URIs and methods to make calls to the webservice. When you request a URI, it returns the representation of an object, that you can then perform operations upon (e.g. GET, PUT, POST, DELETE). You are not limited to picking XML to represent data, you could pick anything really (JSON included)
Flickr's REST API goes further and lets you return images as well.
JSON and XML, are functionally equivalent, and common choices. There are also RPC-based frameworks like GRPC based on Protobufs, and Apache Thrift that can be used for communication between the API producers and consumers. The most common format used by web APIs is JSON because of it is easy to use and parse in every language.
WSDL: Stands for Web Service Description Language
In SOAP(simple object access protocol), when you use web service and add a web service to your project, your client application(s) doesn't know about web service Functions. Nowadays it's somehow old-fashion and for each kind of different client you have to implement different WSDL files. For example you cannot use same file for .Net and php client.
The WSDL file has some descriptions about web service functions. The type of this file is XML. SOAP is an alternative for REST.
REST: Stands for Representational State Transfer
It is another kind of API service, it is really easy to use for clients. They do not need to have special file extension like WSDL files. The CRUD operation can be implemented by different HTTP Verbs(GET for Reading, POST for Creation, PUT or PATCH for Updating and DELETE for Deleting the desired document) , They are based on HTTP protocol and most of times the response is in JSON or XML format. On the other hand the client application have to exactly call the related HTTP Verb via exact parameters names and types. Due to not having special file for definition, like WSDL, it is a manually job using the endpoint. But it is not a big deal because now we have a lot of plugins for different IDEs to generating the client-side implementation.
SOA: Stands for Service Oriented Architecture
Includes all of the programming with web services concepts and architecture. Imagine that you want to implement a large-scale application. One practice can be having some different services, called micro-services and the whole application mechanism would be calling needed web service at the right time.
Both REST and SOAP web services are kind of SOA.
JSON: Stands for javascript Object Notation
when you serialize an object for javascript the type of object format is JSON.
imagine that you have the human class :
class Human{
string Name;
string Family;
int Age;
}
and you have some instances from this class :
Human h1 = new Human(){
Name='Saman',
Family='Gholami',
Age=26
}
when you serialize the h1 object to JSON the result is :
[h1:{Name:'saman',Family:'Gholami',Age:'26'}, ...]
javascript can evaluate this format by eval() function and make an associative array from this JSON string. This one is different concept in comparison to other concepts I described formerly.

Custom jena-json writer

I'm creating a web apllication and i want to load a json file to a visualization library. the thing is the json file needs to be in a certain format.
I'm using jena to get data in a json file that is in the TALIS format. How can i get the data writen in a custom format?Is it easier to first get them in talis and then transform them or get them in the desired form from the beginning?
I'd appreciate every possible help!
You don't say how you are serving your data to the client-side JavaScript application. I'm going to take a guess, and assume you are using Jena Fuseki to serve the data. If that's not a correct guess, you'll need to update the question to be more precise about your setup.
I don't think that Fuseki currently supports pluggable writers. So your best solution would be to apply a transformation in the client-side JavaScript to turn the JSON you get from the server into the format that's needed by the visualisation library. I've done this myself in a number of rich-client applications that consume RDF data. I usually find that I would need to apply client-side transform code in any case - often it's not just a difference in the format of the JSON, but also that you need to project some slice or aggregation of the data that's just easier to express in JavaScript rather than in SPARQL or equivalent.