Can I export models from RapidMiner (similar to DataRobot Prime)? - rapidminer

I have seen some AutoML tools that can export models (including the features) as an approximate model in Python. For example, DataRobot has Prime, which is pretty cool.
Is this something we can do in RapidMiner as well?

You have several options here, depending on your actual use case:
In RapidMiner you can store any model in your repository and run it on any other RapidMiner instance with the generic Apply Model operator.
For most models you can use the PMML extension to export them in a common format.
If you are interested in the parameters and the description of a model, the Converters extension has operators to transform a model into an example set.
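If you go the PMML route, the exported file can then be scored from Python without RapidMiner, which is close to what DataRobot Prime gives you. A minimal sketch using the third-party pypmml package (the file name and feature names below are placeholders):

from pypmml import Model

# Load a model exported from RapidMiner via the PMML extension.
model = Model.fromFile("model.pmml")

# Score a single record; keys must match the feature names in the PMML file.
result = model.predict({"feature_1": 0.5, "feature_2": "A"})
print(result)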

Related

Pros and cons of Pydantic compared to JSON Schema

As far as I understand, Pydantic and JSON Schema provide similar functionality: both can be used for validating data outputs.
I am interested in understanding the pros and cons of using each one. A few questions I am interested in:
Are there any differences in accuracy between them?
Which one is faster to implement in terms of development time?
Is there any functionality difference between the two, i.e. features one supports that the other doesn't?
These are only examples of the questions I am thinking about; I would love to know more about the pros and cons as well.
While both Pydantic and JSON Schema are used to verify that data adheres to a certain format, they serve different use cases:
JSON Schema: a tool for defining JSON structures independent of any implementation or programming language.
Pydantic: a Python-specific tool for validating input data against a Pydantic-specific definition.
You can find JSON Schema validator implementations in many languages; those are the tools you might want to check out in a 1:1 comparison with Pydantic. However, Pydantic understands JSON Schema: you can create Pydantic code from a JSON Schema and also export a Pydantic definition to JSON Schema. They should be equivalent from a functional perspective. You can find a type mapping in the Pydantic docs.
So, which should you use? Your use case matters, but most likely it's not either/or. If you're Python-only and prefer to define your schema directly in Python, definitely go for Pydantic. If you need to exchange schemas across languages or want to handle schemas generated somewhere else, you can add JSON Schema on top and Pydantic will be able to handle it.
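To make the round trip concrete, here is a minimal sketch using the Pydantic v2 API (in v1 the export method is .schema() instead of .model_json_schema()):

from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int = 0

# Validate input data against the Python-side definition.
user = User(name="Ada", age=36)

# Export the same definition as a JSON Schema for other languages and tools.
print(User.model_json_schema())

Going the other way, from an existing JSON Schema to Pydantic classes, is typically done with a code generator such as datamodel-code-generator.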

Using OpenVINO pre-trained models with AWS SageMaker

I'm looking to deploy a pre-trained model for real-time pedestrian and/or vehicle detection using the AWS SageMaker workflow. In particular, I want to use SageMaker Neo to compile the model and deploy it on the edge. I want to use one of OpenVINO's prebuilt models from their model zoo, but when I download a model it is already in their Intermediate Representation (IR) format for their own optimizer.
Is there a way to get an OpenVINO pre-trained model not in IR format so that I can use it in SageMaker? Or any possible way to containerize the OpenVINO model for use in SageMaker?
If not, are there any free pre-trained models (using any of the popular frameworks like PyTorch, TensorFlow, ONNX, etc.) that I can use for vehicle detection from a traffic camera POV? AWS Marketplace does not seem to have much to offer in this regard.
Answers to the questions, in order:
No. The models are only available in Intermediate Representation (IR) format.
There are a few OpenVINO pre-trained models available for vehicle detection. Check out the list of Object Detection Models relevant for vehicle detection on these GitHub pages:
https://github.com/openvinotoolkit/open_model_zoo/blob/master/models/intel/index.md
https://github.com/openvinotoolkit/open_model_zoo/blob/master/models/public/index.md
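Since IR is the only distribution format, one practical path is to run the IR model with the OpenVINO runtime inside your own SageMaker container. A sketch of loading and running an IR model (the file name is a placeholder; this uses the openvino.runtime Python API from OpenVINO 2022 and later):

import numpy as np
from openvino.runtime import Core

# Load an IR model; the .xml file references the .bin weights next to it.
core = Core()
model = core.read_model("vehicle-detection.xml")
compiled = core.compile_model(model, "CPU")

# Run one inference on a dummy input shaped like the model's input.
dummy = np.zeros(list(compiled.input(0).shape), dtype=np.float32)
result = compiled([dummy])
print(result[compiled.output(0)].shape)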

Building a pipeline model using AllenNLP

I am pretty new to AllenNLP and I am struggling to build a model that does not seem to fit perfectly into the standard way of building models in AllenNLP.
I want to build an NLP pipeline model. The pipeline consists of two models, let's call them A and B. First A is trained, and then B is trained on the predictions of the fully trained A.
What I have seen is that people define two separate models, train both using the command line interface allennlp train ... in a shell script that looks like
# set a bunch of environment variables
...
allennlp train -s $OUTPUT_BASE_PATH_A --include-package MyModel --force $CONFIG_MODEL_A
# prepare environment variables for model b
...
allennlp train -s $OUTPUT_BASE_PATH_B --include-package MyModel --force $CONFIG_MODEL_B
I have two concerns about that:
This code is hard to debug.
It's not very flexible. When I want to do a forward pass with the fully trained pipeline, I have to write yet another bash script that does that.
Any ideas on how to do that in a better way?
I thought about using a Python script instead of a shell script and invoking allennlp.commands.main(..) directly. That way you at least have a joint Python module you can run with a debugger.
There are two possibilities.
If you're really just plugging the output of one model into the input of another, you can merge them into one model and run it that way. You can do this with two already-trained models if you initialize the combined model from the two trained models loaded from file. Doing it at training time is a little harder, but not impossible. You would train the first model as you do now. For the second step, you train the combined model directly, with the inner first model's weights frozen (see the sketch below).
The other thing you can do is use AllenNLP as a library, without the config files. We have a template up on GitHub that shows you how to do this. The basic insight is that everything you configure in one of the Jsonnet configuration files corresponds 1:1 to a Python class that you can use directly from Python. There is no requirement to use the configuration files. If you use AllenNLP this way, you have much more flexibility, including chaining things together.
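A minimal sketch of the combined-model idea (the PipelineModel class and the way A's output feeds into B are assumptions; the details depend on your two models):

import torch
from allennlp.models import Model

class PipelineModel(Model):
    # Hypothetical wrapper: runs trained model A, feeds its output into B.
    def __init__(self, vocab, model_a: Model, model_b: Model):
        super().__init__(vocab)
        self.model_a = model_a
        self.model_b = model_b
        # Freeze the already-trained first stage.
        for p in self.model_a.parameters():
            p.requires_grad = False

    def forward(self, **inputs):
        with torch.no_grad():
            a_output = self.model_a(**inputs)
        # How A's output maps onto B's inputs depends on your models.
        return self.model_b(a_output=a_output, **inputs)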

Ray RLlib: Export policy for external use

I have a PPO-based policy model that I train with RLlib using the Ray Tune API on some standard Gym environments (with no fancy preprocessing). I have model checkpoints saved which I can load and restore for further training.
Now I want to export my model for production onto a system that should ideally have no dependencies on Ray or RLlib. Is there a simple way to do this?
I know that there is an export_model interface in the rllib.policy.tf_policy class, but it doesn't seem particularly easy to use. For instance, after calling export_model('savedir') in my training script, and in another context loading via model = tf.saved_model.load('savedir'), the resulting model object is troublesome to feed the correct inputs into for evaluation (something like model.signatures['serving_default'](gym_observation) doesn't work). I'm ideally looking for a method that allows out-of-the-box model loading and evaluation on observation objects.
Once you have restored from a checkpoint with agent.restore(checkpoint_path), you can use agent.export_policy_model(output_dir) to export the model as a .pb file plus a variables folder.
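Loading the export then only requires TensorFlow. A sketch, assuming a CartPole-sized observation; the signature's input names vary between exports, so inspect them before calling:

import numpy as np
import tensorflow as tf

model = tf.saved_model.load("savedir")  # the export_policy_model output dir
infer = model.signatures["serving_default"]

# Inspect the expected input names and shapes first.
print(infer.structured_input_signature)

obs = np.zeros((1, 4), dtype=np.float32)  # e.g. one CartPole observation
# "observations" is an assumption; use the name the signature actually lists.
out = infer(observations=tf.constant(obs))
print(out)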

List of fastai models

fastai uses AWD-LSTM for text processing. They provide pretrained models with get_language_model(), but I can't find proper documentation of what's available.
Their GitHub example page is really a moving target. Model names such as lstm_wt103 and WT103_1 are used; in the forums I found wt103RNN.
Where can I find an up-to-date list of pretrained models and their download URLs?
URLs is defined in fastai.datasets; it has constants for two models: WT103 and WT103_1.
The AWS bucket has just these two models.
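A quick way to inspect them from Python (this is the fastai v1 API the answer refers to; later versions moved and renamed these constants):

from fastai.datasets import URLs

# The two pretrained-model URL constants mentioned above.
print(URLs.WT103)
print(URLs.WT103_1)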