Accessing training metrics in stable-baselines3 - reinforcement-learning

Is it possible to access the A2C total loss and whether the environment truncated or terminated within a custom callback?
I'd like to access truncated and terminated in _on_step. That would allow me to terminate training when the environment truncates, and also allow me to record training episode durations. I'd also like to be able to record total loss, I assume in _on_rollout_end?

You need to attach a callback that implements an _on_step method returning a bool, checking your env's variables. Something like this (I always check whether my env is a VecEnv, since it accesses its variables a bit differently compared to a non-vectorized one):
from stable_baselines3.common.callbacks import BaseCallback
from stable_baselines3.common.vec_env import VecEnv

class StopOnTruncCallback(BaseCallback):
    def __init__(self, verbose: int = 0):
        super().__init__(verbose)

    def _on_step(self) -> bool:
        # Returning False from _on_step aborts training, so stop as soon
        # as the environment reports truncation.
        return not self._is_trunc()

    def _is_trunc(self) -> bool:
        if isinstance(self.training_env, VecEnv):
            return self.training_env.get_attr("truncated")[0]
        else:
            return self.training_env.truncated
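Attaching it looks like this (a minimal sketch; the algorithm and environment are placeholders, and the env is assumed to expose a truncated attribute):
from stable_baselines3 import A2C

model = A2C("MlpPolicy", "CartPole-v1", verbose=1)
model.learn(total_timesteps=50_000, callback=StopOnTruncCallback())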


Add a TensorBoard metric from my PettingZoo environment

I'm using TensorBoard to see the progress of the PettingZoo environment that my agents are playing. I can see the reward go up over time, which is good, but I'd like to add other metrics that are specific to my environment, i.e. I'd like TensorBoard to show me more charts with my metrics and how they improve over time.
The only way I could figure out how to do that was by inserting a few lines into the learn method of OnPolicyAlgorithm, which is part of SB3. This works, and I got the charts I wanted (the two bottom charts were the ones I added).
But obviously editing library code isn't good practice; I should make the modifications in my own code, not in the library. Is there a more elegant way to add a metric from my PettingZoo environment into TensorBoard?
You can add a callback to add your own logs; see the example below. In this case the callback is called every step. There are other callbacks that you can use, depending on your use case.
import numpy as np
from stable_baselines3 import SAC
from stable_baselines3.common.callbacks import BaseCallback

model = SAC("MlpPolicy", "Pendulum-v1", tensorboard_log="/tmp/sac/", verbose=1)

class TensorboardCallback(BaseCallback):
    """
    Custom callback for plotting additional values in tensorboard.
    """
    def __init__(self, verbose=0):
        super().__init__(verbose)

    def _on_step(self) -> bool:
        # Log scalar value (here a random variable)
        value = np.random.random()
        self.logger.record('random_value', value)
        return True

model.learn(50000, callback=TensorboardCallback())
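You can then inspect the extra chart alongside the built-in ones by pointing TensorBoard at the log directory, e.g. tensorboard --logdir /tmp/sac/.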

Pytest: Create reusable code in conftest.py

Due to the way pytest works, it is not possible (or recommended) to import other modules in a pytest module. Instead, one should properly edit its conftest.py file.
Several times I have been put in a situation where I need to share constants/functions across several test modules, and fixtures fail to be as practical as plain functions. Even if they can be indirectly parametrized with the indirect parameter, there are still situations where it's not possible, or simple, to use this approach.
For constants, I am in the following situation, here is an extract of my conftest.py:
import pytest

TARGET_NAME_1 = 'MY_OP4510'
TARGET_NAME_2 = 'MY_ML605'
TARGET_NAME_3 = 'TARGET_WITH_CHILD'
CONFIG_FILE_NAME = 'config.ini'

@pytest.fixture()
def target_name_1():
    """This fixture returns a target name"""
    return TARGET_NAME_1

@pytest.fixture()
def target_name_2():
    """This fixture returns a target name"""
    return TARGET_NAME_2

@pytest.fixture()
def target_name_3():
    """This fixture returns a target name"""
    return TARGET_NAME_3

@pytest.fixture()
def target_config_path():
    """This fixture returns the config path"""
    return CONFIG_FILE_NAME
Every time I have to add a constant, I have to add a fixture. Also, this increases the number of parameters the test functions will receive (and while in this case I could use the autouse parameter, for some other fixtures that actually execute code I do not necessarily want to auto-use them, as they could prevent other test cases from working).
I am looking for a way to simplify this code; would you have a good pattern/implementation to suggest?
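One possible simplification (a sketch of one pattern, not from the original thread): bundle the constants into a single object returned by one fixture, so adding a constant no longer means adding a fixture:
import pytest
from types import SimpleNamespace

CONSTANTS = SimpleNamespace(
    target_name_1='MY_OP4510',
    target_name_2='MY_ML605',
    target_name_3='TARGET_WITH_CHILD',
    config_file_name='config.ini',
)

@pytest.fixture()
def constants():
    """Single fixture bundling all shared test constants."""
    return CONSTANTS

# In a test module:
def test_config_name(constants):
    assert constants.config_file_name == 'config.ini'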

How do I get django "Data too long for column '<column>' at row" errors to print the actual value?

I have a Django application. Sometimes in production I get an error when uploading data that one of the values is too long. It would be very helpful for debugging if I could see which value was the one that went over the limit. Can I configure this somehow? I'm using MySQL.
It would also be nice if I could enable/disable this on a per-model or column basis so that I don't leak user data to error logs.
When creating model instances from outside sources, one must take care to validate the input, or have other guarantees that the data cannot violate constraints.
When you don't call at least full_clean() on the model but directly call save(), you bypass Django's validators and will only be alerted to the problem by the database driver, at which point it's harder to obtain diagnostics:
import json

from django.core.exceptions import ValidationError
from django.db import models

class JsonImportManager(models.Manager):
    # Note: the method can't be named "import", which is a reserved
    # keyword in Python.
    def import_json(self, json_string: str) -> int:
        data_list = json.loads(json_string)  # list of objects => list of dicts
        failed = 0
        for data in data_list:
            obj = self.model(**data)
            try:
                obj.full_clean()
            except ValidationError as e:
                print(e.message_dict)  # or use a better formatting function
                failed += 1
            else:
                obj.save()
        return failed
This is of course very simple, but it's a good boilerplate to get started with.
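To use it, attach the manager to a model; the model and field below are made-up placeholders for illustration:
class Book(models.Model):
    title = models.CharField(max_length=100)

    objects = JsonImportManager()

# A title longer than max_length fails full_clean(), and the offending
# field and value show up in e.message_dict instead of a bare driver error.
failed = Book.objects.import_json('[{"title": "' + 'x' * 200 + '"}]')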

Multiplayer game using pygame [duplicate]

We are working on a top-down-RPG-like multiplayer game for learning purposes (and fun!) with some friends. We already have some entities in the game and inputs are working, but the network implementation gives us headaches :D
The Issues
When trying to convert with dict, some values will still contain a pygame.Surface, which I don't want to transfer and which causes errors when trying to JSON-ify them. Other objects that I would like to transfer in a simplified way, like Rectangle, cannot be converted automatically.
Already functional
Client-Server connection
Transferring JSON objects in both directions
Async networking and synchronized putting into a Queue
Situation
A new player connects to the server and wants to get the current game state with all objects.
Data-Structure
We use a "Entity-Component" based architecture, so we separated the game logic very strictly into "systems", while the data is stored in the "components" of each Entity. The Entity is a very simple container and has nothing more than a ID and a list of components. Example Entity (shorten for better readability):
Entity
|-- Component (Moveable)
|-- Component (Graphic)
| |- complex datatypes like pygame.SURFACE
| `- (...)
`- Component (Inventory)
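For reference, a minimal sketch of that container (attribute names assumed from the snippets below):
class Entity:
    def __init__(self, entity_id):
        self.id = entity_id    # unique ID
        self.components = []   # list of Component instances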
We tried different approaches, but none seems to fit very well, or they feel "hacky".
pickle
Very Python-specific, so it would not be easy to implement clients in other languages in the future. And I've read about some security risks when creating objects from the network in the dynamic way pickle offers. It does not even solve the Surface/Rectangle issue.
__dict__
Still contains the references to the old objects, so a "cleanup" or "filter" for unwanted datatypes also happens on the originals. A deepcopy throws an exception:
...\Python\Python36\lib\copy.py", line 169, in deepcopy
rv = reductor(4)
TypeError: can't pickle pygame.Surface objects
Show some code
The method of the "EnitityManager" class should generate the snapshot of all entities, including their components. This snapshot should be convertible to JSON without any errors, and if possible without much configuration in this core class.
import logging
from copy import deepcopy

class EnitityManager:
    def generate_world_snapshot(self):
        """ Returns a dictionary with all entities and their components to send
        this to the client. This function will probably generate a lot of data,
        but it's to send the whole current game state when a new player
        connects or when a complete refresh is required """
        # It should be possible to add more objects to the snapshot, so we
        # create our own snapshot datastructure
        result = {'entities': {}}
        entities = self.get_all_entities()
        for e in entities:
            result['entities'][e.id] = deepcopy(e.__dict__)
            # Components are objects, but a dictionary is required for transfer
            cmp_obj_list = result['entities'][e.id]['components']
            # Empty the current list of components; it is going to be filled
            # with dictionaries of each cmp, which are cleaned for the dump
            # because of the errors when directly converting the whole
            # datastructure to JSON
            result['entities'][e.id]['components'] = {}
            for cmp in cmp_obj_list:
                cmp_copy = deepcopy(cmp)
                cmp_dict = cmp_copy.__dict__
                # Only list, dict, int, str, float and None will stay, while
                # other types are simply deleted, including their key.
                # Lists and dictionaries are cleaned up recursively as well
                cmp_dict = self.clean_complex_recursive(cmp_dict)
                result['entities'][e.id]['components'][type(cmp_copy).__name__] \
                    = cmp_dict
        logging.debug("EntityMgr: Entity#3: %s" % result['entities'][3])
        return result
Expectation and actual results
We can find a way to manually override elements which we don't want. But as the list of components increases, we would have to put all the filter logic into this core class, which should not contain any component specializations.
Do we really have to put all the logic into the EntityManager for filtering the right objects? This does not feel good, as I would like to have all conversion to JSON done without any hardcoded configuration.
How do we convert all this complex data in the most generic way possible?
Thanks for reading so far and thank you very much for your help in advance!
Interesting articles which we were already working through, and which may be helpful for others with similar issues:
https://gafferongames.com/post/what_every_programmer_needs_to_know_about_game_networking/
http://code.activestate.com/recipes/408859/
https://docs.python.org/3/library/pickle.html
UPDATE: Solution - thanks to sloth
We used a combination of the following approaches, which works really great so far and is also easy to maintain!
The EntityManager now calls the get_state() function of each entity.
class EntitiyManager:
    def generate_world_snapshot(self):
        """ Returns a dictionary with all entities and their components to send
        this to the client. This function will probably generate a lot of data,
        but it's to send the whole current game state when a new player
        connects or when a complete refresh is required """
        # It should be possible to add more objects to the snapshot, so we
        # create our own snapshot datastructure
        result = {'entities': {}}
        entities = self.get_all_entities()
        for e in entities:
            result['entities'][e.id] = e.get_state()
        return result
The entity has only some basic attributes to add to the state and forwards the get_state() call to all of its components:
class Entity:
    def get_state(self):
        state = {'name': self.name, 'id': self.id, 'components': {}}
        for cmp in self.components:
            state['components'][type(cmp).__name__] = cmp.get_state()
        return state
The components themselves now inherit their get_state() method from their new superclass Component, which simply takes care of all simple datatypes:
import logging

class Component:
    def __init__(self):
        logging.debug('generic component created')

    def get_state(self):
        state = {}
        for attr, value in self.__dict__.items():
            if value is None or isinstance(value, (str, int, float, bool)):
                state[attr] = value
            elif isinstance(value, (list, dict)):
                # logging.warn("Generating state: not supporting lists yet")
                pass
        return state

class GraphicComponent(Component):
    # (...)
    pass
Now every developer has the opportunity to override this method with a more detailed get_state() implementation for complex types directly in the component classes (like Graphic, Movement, Inventory, etc.) if the state needs to be saved more accurately, which is a huge win for maintaining the code in the future: these code pieces live in one class.
The next step is to implement, in the same class, the static method for creating the instances from the state. This makes everything work really smoothly.
Thank you so much sloth for your help.
Do we really have to put all the logic into the EntityManager for filtering the right objects?
No, you should use polymorphism.
You need a way to represent your game state in a form that can be shared between different systems; so maybe give your components a method that will return all of their state, and a factory method that allows you to create the component instances out of that very state.
(Python already has the __repr__ magic method, but you don't have to use it)
So instead of doing all the filtering in the entity manager, just let it call this new method on all components and let each component decide what the result will look like.
Something like this:
...
result = {'entities': {}}
entities = self.get_all_entities()
for e in entities:
    result['entities'][e.id] = {'components': {}}
    for cmp in e.components:
        result['entities'][e.id]['components'][type(cmp).__name__] = cmp.get_state()
...
And a component could implement it like this:
class GraphicComponent:
    def __init__(self, pos=...):
        self.image = ...
        self.rect = ...
        self.whatever = ...

    def get_state(self):
        return {'pos_x': self.rect.x, 'pos_y': self.rect.y, 'image': 'name_of_image.jpg'}

    @staticmethod
    def from_state(state):
        # state is a plain dict, so index it rather than using attribute access
        return GraphicComponent(pos=(state['pos_x'], state['pos_y']), ...)
And a client's EntityManager that receives the state from the server would iterate over the component list of each entity and call from_state to create the instances.
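For completeness, a rough sketch of that client-side loop (component_registry and the function name are assumptions, not from the thread):
import json

def rebuild_entities(snapshot_json, component_registry):
    # component_registry maps class names to component classes,
    # e.g. {'GraphicComponent': GraphicComponent}
    snapshot = json.loads(snapshot_json)
    entities = {}
    for entity_id, entity_state in snapshot['entities'].items():
        entities[entity_id] = [
            component_registry[name].from_state(state)
            for name, state in entity_state['components'].items()
        ]
    return entities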

Accessing a variable outside a RequestHandler

I'm using a CSS3 accordion effect, and I want to detect whether an attacker is using a script to make parallel requests. That is: I have a login form and a registration form on the same page, but only one is visible at a time because of the CSS3; to access the page, the user agent must be HTML5 compatible.
The trick I use is:
import time
import tornado.web

class Register(tornado.web.RequestHandler):
    def post(self):
        tt = self.get_argument("_xsrf") + str(time.time())
        rtime = float(tt.replace(self.get_argument("_xsrf"), ""))
        print(rtime)

class LoginHandler(BaseHandler):  # BaseHandler is defined elsewhere in the app
    def post(self):
        tt = self.get_argument("_xsrf") + str(time.time())
        ltime = float(tt.replace(self.get_argument("_xsrf"), ""))
        print(ltime)
I've used the xsrf variable because it's unique for every user, to avoid making the server think that the request is coming from the same machine.
Now what I want: how do I compute the difference between the time values, abs(ltime - rtime)? That is, how do I access rtime outside the class? I only know how to access a value outside a method. I want to do this operation to detect whether the difference is small, in which case the user is using a script to make parallel requests to kill the server!
In other words (for general Python users), if I have:
class Product:
    def info(self):
        self.price = 1000

    def show(self):
        print(self.price)

>>> car = Product()
>>> car.info()
>>> car.show()
1000
but what if I have another
class User:
    pass
then how do I make a method that prints self.price? I've tried inheritance, but got the error AttributeError: User instance has no attribute 'price'. So are only methods inherited, not attributes?
It sounds like you need to understand model objects and patterns that use persistent storage of data. tornado.web.RequestHandler, and any object that you subclass from it, only exists for the duration of your request: from when the URL is received on the server to when data is sent back to the browser via a self.write() or self.finish().
I would recommend you look at some of the Django or Flask tutorials for some basic ideas of how to build an MVC application in Python (there are no Tornado tutorials that cover this that I know of).
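That said, for the immediate question of comparing rtime and ltime across handlers, the usual trick is to keep the value somewhere that outlives a single request, for example on the shared Application object. A minimal sketch (the dict key, handler bodies, and the 0.5s threshold are assumptions for illustration, not a vetted anti-abuse scheme):
import time
import tornado.web

class Register(tornado.web.RequestHandler):
    def post(self):
        # self.application is shared by every handler, so state stored on
        # it survives after this handler instance is discarded.
        times = self.application.settings.setdefault('post_times', {})
        times[self.get_argument("_xsrf")] = time.time()

class LoginHandler(tornado.web.RequestHandler):
    def post(self):
        times = self.application.settings.setdefault('post_times', {})
        rtime = times.get(self.get_argument("_xsrf"))
        if rtime is not None and abs(time.time() - rtime) < 0.5:
            # Two POSTs almost simultaneously: likely a script.
            raise tornado.web.HTTPError(429)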