Can I use Gymnasium with Ray's RLlib? - reinforcement-learning

I want to develop a custom Reinforcement Learning environment. Previously, I have been working with OpenAI's gym library and Ray's RLlib. I noticed that the README.md in OpenAI's gym library suggests moving to Gymnasium (https://github.com/Farama-Foundation/Gymnasium), but I have yet to find a statement from Ray on using Gymnasium instead of gym.
Will I have problems using Gymnasium and Ray's RLlib?

Yes, you will at the moment. One difference is that when performing an action in gymnasium with the env.step(action) method, it returns a 5-tuple: the old "done" flag from earlier gym versions has been replaced with two flags, "terminated" and "truncated".
There is an outstanding issue to integrate gymnasium into rllib and I expect this issue will be resolved soon: https://github.com/ray-project/ray/issues/29697
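A minimal sketch of the API difference, using a small compatibility shim (a hypothetical helper, not part of either library) that collapses the new 5-tuple back into the old 4-tuple shape:

```python
# The old gym API returned a 4-tuple from step(): obs, reward, done, info.
# Gymnasium returns a 5-tuple: obs, reward, terminated, truncated, info.

def step_compat(env, action):
    """Call a gymnasium-style env and return the old gym-style 4-tuple."""
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated  # the old "done" flag covered both cases
    return obs, reward, done, info
```

Code written against the old interface can call `step_compat(env, action)` instead of `env.step(action)` until the library it feeds into understands the new signature.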

Related

Use MiniGrid environment with stable-baseline3

I'm using the MiniGrid library to work with different 2D navigation problems as experiments for my reinforcement learning problem. I'm also using the stable-baselines3 library to train PPO models. Unfortunately, while training PPO using stable-baselines3 with a MiniGrid environment, I got the following error.
Error
I imported the environment as follows,
import gymnasium as gym
from minigrid.wrappers import RGBImgObsWrapper, ImgObsWrapper
env = gym.make("MiniGrid-SimpleCrossingS9N1-v0", render_mode="human")
env = RGBImgObsWrapper(env)
env = ImgObsWrapper(env)
The training script using stable-baselines3 is as follows,
model = PPO('CnnPolicy', env, verbose=0)
model.learn(args.timesteps, callback)
I have done a quick debug and found a potential lead. I don't know if this is the real cause.
When I tried to load the environment directly from gym, the action space is <class 'gym.spaces.discrete.Discrete'>. But when loaded from MiniGrid, the action space is <class 'gymnasium.spaces.discrete.Discrete'>. Any help sorting out the problem is highly appreciated. Thanks in advance.
That's the correct analysis: Stable Baselines3 doesn't support Gymnasium yet, so checks against gym.spaces.discrete.Discrete fail for gymnasium spaces.
The following answer explains how to work around that, based on a currently open PR: OpenAI Gymnasium, are there any libraries with algorithms supporting it?.
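The failure mode can be illustrated without either library installed. The stand-in classes below are illustrative only (the real ones live in gym.spaces and gymnasium.spaces): the two Discrete types are structurally identical but distinct classes, so a type check written against one fails for the other, and the workaround is to translate the space before handing the env to SB3:

```python
# Stand-ins for the two distinct Discrete classes (illustrative only).
class GymDiscrete:            # stands in for gym.spaces.discrete.Discrete
    def __init__(self, n):
        self.n = n

class GymnasiumDiscrete:      # stands in for gymnasium.spaces.discrete.Discrete
    def __init__(self, n):
        self.n = n

def to_gym_space(space):
    """Sketch of the workaround: translate a gymnasium space to a gym one."""
    if isinstance(space, GymnasiumDiscrete):
        return GymDiscrete(space.n)
    return space

action_space = GymnasiumDiscrete(7)
# A check written against gym's class fails for gymnasium's class,
# even though both describe the same action space:
isinstance(action_space, GymDiscrete)                 # False
isinstance(to_gym_space(action_space), GymDiscrete)   # True
```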

How can I start writing the code for my layer?

I have seen that researchers add functionality to the original version of Caffe, use those layers and functionalities according to their needs, and then share these versions through GitHub. If I am not mistaken, there are two ways: 1) recompiling Caffe after adding C++ and CUDA versions of the layers, or 2) writing Python code for the functionality and calling it as a Python layer in Caffe.
I want to add a new layer to Caffe based on my research problem. I really do not know from which point I should start writing the new layer and which steps I should consider.
My questions are:
1) Is there any documentation or learning material that I can use for writing the layer?
2) Which of the above-mentioned methods of adding a new layer is preferred?
I really appreciate any help and guidance
Thanks a lot
For research purposes, for "playing around", it is usually more convenient to write a Python layer: it saves you the hassle of compiling, etc.
You can find a short tutorial on "Python" layer here.
On the other hand, if you want better performance you should write native C++ code for your layer.
You can find a short explanation about it here.
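A Python layer is a class exposing the four methods Caffe calls: setup, reshape, forward, and backward. The skeleton below sketches a ReLU as an example; in real use the class would subclass caffe.Layer (omitted here so the sketch runs stand-alone), blob data would be numpy arrays rather than lists, and Caffe must be built with WITH_PYTHON_LAYER=1:

```python
# Skeleton of a Caffe "Python" layer (method names follow the interface
# Caffe expects; the caffe.Layer base class is omitted for this sketch).

class ReLULayer:  # real version: class ReLULayer(caffe.Layer)
    def setup(self, bottom, top):
        # one-time checks, e.g. the number of input blobs
        if len(bottom) != 1:
            raise Exception("ReLULayer expects a single bottom blob")

    def reshape(self, bottom, top):
        # the output blob has the same shape as the input blob
        top[0].data = [0.0] * len(bottom[0].data)

    def forward(self, bottom, top):
        # elementwise max(0, x)
        top[0].data = [max(0.0, x) for x in bottom[0].data]

    def backward(self, top, propagate_down, bottom):
        # the gradient passes through only where the input was positive
        bottom[0].diff = [d if x > 0 else 0.0
                         for d, x in zip(top[0].diff, bottom[0].data)]
```

In the net prototxt, such a layer is referenced with `type: "Python"` and a `python_param` block naming the module and the class, and the module must be importable from PYTHONPATH.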

Correct design using dependency inversion principle across modules?

I understand dependency inversion when working inside a single module, but I would like to also apply it when I have a cross-module dependency. In the following diagrams I have an existing application and I need to implement some new requirements for reference data services. I thought I would create a new jar (potentially a stand-alone service in the future). The first figure shows the normal way I have approached such things in the past. The referencedataservices jar has an interface which the app will use to invoke it.
The second figure shows my attempt to use DIP, the app now owns its abstraction so it is not subject to change just because the reference data service changes. This seems to be a wrong design though, because it creates a circular dependency. MyApp depends on referencedataservices jar, and referencedataservices jar depends on MyApp.
So the third figure gets back to the more normal dependency by creating an extra layer of abstraction. Am I right? Or is this really not what DIP was intended for? Interested in hearing about other approaches or advice.
The second example is on the right track by separating the implementation from its abstraction. To achieve modularity, a concrete class should not be in the same package (module) as its abstract interface.
The fault in the second example is that the client owns the abstraction, while the service owns the implementation. These two roles must be reversed: services own interfaces; clients own implementations. In this way, the service presents a contract (API) for the client to implement. The service guarantees interaction with any client that adheres to its API. In terms of dependency inversion, the client injects a dependency into the service.
Kirk K. is something of an authority on modularity in Java. He had a blog that eventually turned into a book on the subject. His blog seems to be missing at the moment, but I was able to find it in the Wayback Machine. I think you would be particularly interested in the four-part series titled Applied Modularity. In terms of other approaches or alternatives to DIP, take a look at Fun With Modules, which covers three of them.
In the second approach you presented, if you move the RefDataSvc abstraction into a separate package, you break the cycle: the referencedataservices package then depends only on the package containing the RefDataSvc abstraction.
Code in the MyApp package, apart from the Composition Root, should also depend on RefDataSvc. In the Composition Root of your application you then compose all the dependencies your app needs.
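The principle is language-agnostic, so a compact sketch in Python can stand in for the jar layout in the third figure: the abstraction lives in its own package, the client depends only on it, the service implements it, and the Composition Root is the only place both sides meet. The RefDataSvc name is taken from the question; the other names are made up for illustration:

```python
from abc import ABC, abstractmethod

# --- package: refdata-api (the extracted abstraction) ---
class RefDataSvc(ABC):
    @abstractmethod
    def lookup(self, code: str) -> str: ...

# --- package: referencedataservices (the implementation) ---
class InMemoryRefDataSvc(RefDataSvc):
    def __init__(self, table):
        self._table = table
    def lookup(self, code):
        return self._table[code]

# --- package: myapp (the client, knows only the abstraction) ---
class MyApp:
    def __init__(self, refdata: RefDataSvc):  # dependency injected here
        self._refdata = refdata
    def describe(self, code):
        return f"{code}: {self._refdata.lookup(code)}"

# --- Composition Root: the only place both sides are wired together ---
app = MyApp(InMemoryRefDataSvc({"GBP": "Pound Sterling"}))
```

Because MyApp references only RefDataSvc, swapping the in-memory implementation for a remote service touches only the Composition Root.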

Any open octave development projects for mathematician/physicist programmers versus classdef?

At first I was excited about working on open development projects for Octave related to implementing programs heavy in mathematics and physics, such as a delaunayTriangulation class. But after talking to a few Octave maintainers, I have come to the sad conclusion that Octave will be complete once classdef is complete, at which point physicist- or mathematician-type programmers will no longer be needed to build new functionality into Octave. Is this true?
I have followed your thread on the Octave maintainers mailing list and I think you have misunderstood this quite badly.
Once classdef gets implemented, the problems won't be solved; quite the contrary. It will allow many problems to be solved that can't be addressed yet in a Matlab-compatible way. There are two things here:
You may have felt that there are no problems left to solve after seeing many suggestions of libraries that already solve the problem. That doesn't mean they will be used. Even if licensing allows it, there comes a point where having to "reshape" the data in Octave into whatever form the other library uses is just too much, and a native interface is preferred. This is especially true in Octave because it's mostly written in the Octave language, which allows users to participate in its development.
Even if an external library is used in the end, remember that "the devil is in the details". Implementing an interface between Octave and an external library is not a trivial problem.
When classdef is complete, the work will start, not finish. And classdef is already working in the development version, so if you are interested in those classes, you could start implementing them there and they'd be released with the next version. To continue development of classdef, Octave needs people to use it, so that its problems can be found. And the delaunayTriangulation class requires classdef. They look like a great pair that should be developed together.

Weka: Limitations on what one can output as source?

I consulted several references to discover how I might output trained Weka models as Java source code, so that I can use the classifiers I am training in actual code for research applications I have been developing.
As I was playing with Weka 3.7, I noticed that while it does output Java code to its main text buffer when using simpler (supervised, in my case) classification methods such as the J48 decision tree, it removes the option (rather, it disables it by greying out the checkbox and its text) to output Java code for RandomTree and RandomForest (which are the ones that give me the best performance in my situation).
Note: I am clicking on the "More Options" button and checking "Output source code:".
Does Weka not allow you to output RandomTree or RandomForest as Java code? If so, why? Or, if it does and just doesn't put it in the output buffer (since a RandomForest is multiple decision trees, which I imagine it doesn't want to waste buffer space on), how does one find where in the file system Weka outputs Java code by default?
Are there any tricks to get Weka to give me my trained RandomForest as Java code? Or is Serialization of the output *.model files my only hope when it comes to RF and RandomTree?
Thanks in advance to those who provide help.
NOTE (as an addendum to the answer provided below): if you run across a similar situation (requiring you to use your trained classifier/ML model in your code), I recommend following the links posted in the answer provided in response to my question. If you do not specifically need the Java code for the RandomForest, for example, de-serializing the model works quite nicely and fits into Java application code, fulfilling its task as a trained model meant to predict future unlabelled instances.
RandomTree and RandomForest can't be output as Java code. I'm not sure of the reasoning why, but they don't implement the "Sourceable" interface.
This explains a little about outputting a classifier as Java code: Link 1
This shows which classifiers can be output as Java code: Link 2
Unfortunately, I think the easiest route will be Serialization, although you could maybe try implementing "Sourceable" for other classifiers on your own.
Another, but perhaps inconvenient, solution would be to use Weka to build the classifier every time you use it. You wouldn't need to load the ".model" file, but you would need to load your training data and relearn the model. Here is a starter's guide to building classifiers in your own Java code: http://weka.wikispaces.com/Use+WEKA+in+your+Java+code.
Solved the problem for myself by turning the output of WEKA's -printTrees option of the RandomForest classifier into Java source code.
http://pielot.org/2015/06/exporting-randomforest-models-to-java-source-code/
Since I am using classifiers with Android, all of the existing options had disadvantages:
shipping Android apps with serialized models didn't reliably work across devices
computing the model on the phone took too many resources
The final code will consist of three classes only: the class with the generated model + two classes to make the classification work.