how to solve module 'gym.wrappers' has no attribute 'Monitor'？

how to solve module 'gym.wrappers' has no attribute 'Monitor'？ - reinforcement-learning

import gym
if __name__ == "__main__":
env = gym.make("CartPole-v0")
env = gym.wrappers.Monitor(env, "recording")
total_reward = 0.0
total_steps = 0
obs = env.reset()
while True:
action = env.action_space.sample()
obs, reward, done, _ = env.step(action)
total_reward += reward
total_steps += 1
if done:
break
print("Episode done in %d steps, total reward %.2f" % (
total_steps, total_reward))
env.close()
env.env.close()
these codes are from ：Maxim Lapan. Deep Reinforcement Learning Hands-On
when i run these code,i get this:'gym.wrappers' has no attribute 'Monitor'
i try to search on google to find answer,but i still have no idea about the way to solve the question.

it looks like they've changed the API and removed the 'Monitor'-wrapper (https://github.com/openai/gym/releases/tag/0.23.0).
You could either use the version of 'gym' from the book (0.15.3) (e.g. pip install gym==0.15.3) or you could use the render()-method of the updated environment class (from v0.23.0).
In the latter case you'd need to include env.render() into the while-loop:
while True:
action = env.action_space.sample()
obs, reward, done, _ = env.step(action)
total_reward += reward
total_steps += 1
env.render()
if done:
break

Use this
from gym.wrappers.monitoring.video_recorder import VideoRecorder
how to use it can be found here
https://www.anyscale.com/blog/an-introduction-to-reinforcement-learning-with-openai-gym-rllib-and-google

Related

Deep Q Learning - Cartpole Environment

I have a concern in understanding the Cartpole code as an example for Deep Q Learning. The DQL Agent part of the code as follow:
class DQLAgent:
def __init__(self, env):
# parameter / hyperparameter
self.state_size = env.observation_space.shape[0]
self.action_size = env.action_space.n
self.gamma = 0.95
self.learning_rate = 0.001
self.epsilon = 1 # explore
self.epsilon_decay = 0.995
self.epsilon_min = 0.01
self.memory = deque(maxlen = 1000)
self.model = self.build_model()
def build_model(self):
# neural network for deep q learning
model = Sequential()
model.add(Dense(48, input_dim = self.state_size, activation = "tanh"))
model.add(Dense(self.action_size,activation = "linear"))
model.compile(loss = "mse", optimizer = Adam(lr = self.learning_rate))
return model
def remember(self, state, action, reward, next_state, done):
# storage
self.memory.append((state, action, reward, next_state, done))
def act(self, state):
# acting: explore or exploit
if random.uniform(0,1) <= self.epsilon:
return env.action_space.sample()
else:
act_values = self.model.predict(state)
return np.argmax(act_values[0])
def replay(self, batch_size):
# training
if len(self.memory) < batch_size:
return
minibatch = random.sample(self.memory,batch_size)
for state, action, reward, next_state, done in minibatch:
if done:
target = reward
else:
target = reward + self.gamma*np.amax(self.model.predict(next_state)[0])
train_target = self.model.predict(state)
train_target[0][action] = target
self.model.fit(state,train_target, verbose = 0)
def adaptiveEGreedy(self):
if self.epsilon > self.epsilon_min:
self.epsilon *= self.epsilon_decay
In the training section, we found our target and train_target. So why did we set train_target[0][action] = target here?
Every predict made while learning is not correct, but thanks to error calculation and backpropagation, the predict made at the end of the network will get closer and closer, but when we make train_target[0][action] = target here the error becomes 0, and in this case, how will the learning be?

self.model.predict(state) will return a tensor of shape of (1, 2) containing the estimated Q values for each action (in cartpole the action space is {0,1}).
As you know the Q value is a measure of the expected reward.
By setting self.model.predict(state)[0][action] = target (where target is the expected sum of rewards) it is creating a target Q value on which to train the model. By then calling model.fit(state, train_target) it is using the target Q value to train said model to approximate better Q values for each state.
I don't understand why you are saying that the loss becomes 0: the target is set to the discounted sum of rewards plus the current reward
target = reward + self.gamma*np.amax(self.model.predict(next_state)[0])
while the network prediction for the highest Q value is
np.amax(self.model.predict(next_state)[0])
The loss between the target and the predicted values is what is used to train the model.
Edit - more detailed explaination
(you can ignore the [0] to the predicted values, as it is just to access the right column and unimportant in the understanding)
The target variable is set to the sum between the current reward and the estimated sum of future rewards, or the Q value. Note that this variable is called target but it is not the target of the network, but the target Q value for the chosen action.
The train_target variable is used as what you call the "dataset". It represents the target of the network.
train_target = self.model.predict(state)
train_target[0][action] = target
You can clearly see that:
train_target[<taken action>] = reward + self.gamma*np.amax(self.model.predict(next_state)[0])
train_target[<any other action>] = <prediction from the model>
the loss (mean squared error):
prediction = self.model.predict(state)
loss = (train_target - prediction)^2
For any line of the that is not the the loss is 0. For the one line that has been set the loss is
(target - prediction[action])^2
or
((reward + self.gamma*np.amax(self.model.predict(next_state)[0])) - self.model.predict(state)[0][action])^2
which is clearly different from 0.
Note that this agent is not ideal. I would strongly recommend the use of a target model instead of creating target Q values that way. Check out this answer as for why.

Google Cloud Functions - How to set up a function (trading bot)

I would like to set up a trading bot via Google Cloud to run around the clock.
In Google Cloud Functions I use the Inline editor with runtime Python 3.7.
I have two questions:
1) Main.py section: Here I copied the full code of my Python script (Trading Bot) - see code below for reference (which works well when run as a script in my IDE Spyder).
However, below Google asks to provide a function to execute. However, my code is just a script with no main function. Can I just put at the top of the code e.g.: "def trading_bot(self):" and indent the remaining part below?
While the code as a script copied below works well, if I add the "def trading_bot(self):" at the top in my IDE (Spyder), the code doesnt seem to work properly...How can I make sure the code within the function runs properly, when I call the function from Google Cloud (or from my IDE).
2) Requirements.txt section: Can you provide guidance what exactly I need to put there, i.e. can I look up the dependencies used in my code somewhere? I use Anaconda for distribution, the classes imported for the script are at the top of the script provided below.
Thanks for any help. Glad also for your advice if you think Google Cloud Functions is not the best approach to run a trading bot but it seemed to me to be the simplest solution.
import bitmex
import json
from time import sleep
from bitmex_websocket import BitMEXWebsocket
import logging, time, requests
import numpy as np
import pandas as pd
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")
from datetime import datetime
import math
from statistics import mean
#-------------------------
#variable
symbol = "XBTUSD"
#standard API connection
api_key = "XXX"
api_secret = "XXX"
#True for testnet
client = bitmex.bitmex(test=False, api_key=api_key, api_secret=api_secret)
#------------------
# Trading algorithm
symbol = "XBTUSD"
ordType = 'Stop'
#starting order quantity
orderQty = 1
leftBars = 6
rightBars = 2
#round to 0.5
def round_to_BTC(n):
result = round(n*2)/2
return result
t=1
while t < 1000000:
time_now = (time.strftime('%H:%M:%S', time.localtime(int(time.time()))))
t_now = time_now[6:8]
t1 = "00"
t2 = "59"
FMT = '%S'
def days_hours_minutes_seconds(td):
return td.days, td.seconds//3600, (td.seconds//60)%60, td.seconds
if t_now == str('00'):
#give 1 second to candlestick to properly close
sleep(1)
elif t_now > str('00') and t_now <= str('59'):
s1 = datetime.strptime(t2, FMT) - datetime.strptime(t_now, FMT)
s1_seconds = days_hours_minutes_seconds(s1)[3]+2
sleep(s1_seconds)
else:
pass
time_now = (time.strftime('%H:%M:%S', time.localtime(int(time.time()))))
print("The time is now: " + time_now)
#most recent swing candles, get highs and lows / #binsizes = {"1m": 1, "5m": 5, "1h": 60, "1d": 1440}
#+1 is the middle bar
totalBars = leftBars + rightBars + 1
swing_candles = client.Trade.Trade_getBucketed(symbol=symbol, binSize="1m", count=totalBars, reverse=True).result()[0]
last_highs = []
last_lows = []
i=0
while i <= (len(swing_candles)-1):
last_highs.append(swing_candles[i]["high"])
last_lows.append(swing_candles[i]["low"])
i += 1
#get the highest high and the lowest low
highest_high = max(last_highs)
lowest_low = min(last_lows)
#check if there are existing positions & orders
if client.Position.Position_get().result()[0] != []:
positions_quantity = client.Position.Position_get().result()[0][0]["currentQty"]
else:
positions_quantity = 0
#check existing orders
buy_orders_quantity = []
sell_orders_quantity = []
orders_quantity = client.Order.Order_getOrders(filter=json.dumps({"open": True})).result()[0]
h=0
while h <= len(orders_quantity)-1:
if orders_quantity[h]["side"] == "Sell":
sell_orders_quantity.append(orders_quantity[h])
elif orders_quantity[h]["side"] == "Buy":
buy_orders_quantity.append(orders_quantity[h])
h += 1
if highest_high == last_highs[rightBars] and positions_quantity == 0:
if buy_orders_quantity == []:
client.Order.Order_new(symbol = symbol, orderQty = orderQty*1, side = "Buy", ordType = 'Stop', stopPx = (highest_high+0.5), execInst ='LastPrice' ).result()
elif buy_orders_quantity != []:
orderID = buy_orders_quantity[0]["orderID"]
client.Order.Order_amend(orderID=orderID, orderQty=orderQty*1, stopPx = (highest_high+0.5)).result()
else:
pass
elif highest_high == last_highs[rightBars] and positions_quantity > 0:
#dont place any additional long
pass
elif highest_high == last_highs[rightBars] and positions_quantity < 0:
if buy_orders_quantity != []:
orderID = buy_orders_quantity[0]["orderID"]
client.Order.Order_amend(orderID=orderID, orderQty=orderQty*2, stopPx = (highest_high+0.5)).result()
else:
client.Order.Order_new(symbol = symbol, orderQty = (orderQty)*2, side = "Buy", ordType = 'Stop', stopPx = (highest_high+0.5), execInst ='LastPrice' ).result()
elif lowest_low == last_lows[rightBars] and positions_quantity == 0:
if sell_orders_quantity == []:
client.Order.Order_new(symbol = symbol, orderQty = (orderQty)*-1, side = "Sell", ordType = 'Stop', stopPx = (lowest_low-0.5), execInst ='LastPrice' ).result()
elif sell_orders_quantity != []:
orderID = sell_orders_quantity[0]["orderID"]
client.Order.Order_amend(orderID=orderID, orderQty=orderQty*-1, stopPx = (lowest_low-0.5)).result()
else:
pass
elif lowest_low == last_lows[rightBars] and positions_quantity < 0:
#dont place any additional shorts
pass
elif lowest_low == last_lows[rightBars] and positions_quantity > 0:
if sell_orders_quantity != []:
orderID = sell_orders_quantity[0]["orderID"]
client.Order.Order_amend(orderID=orderID, orderQty=orderQty*-2, stopPx = (lowest_low-0.5)).result()
else:
client.Order.Order_new(symbol = symbol, orderQty = (orderQty)*-2, side = "Sell", ordType = 'Stop', stopPx = (lowest_low-0.5), execInst ='LastPrice' ).result()
positions_quantity = client.Position.Position_get().result()[0][0]["currentQty"]
buy_orders_quantity = []
sell_orders_quantity = []
orders_quantity = client.Order.Order_getOrders(filter=json.dumps({"open": True})).result()[0]
h=0
while h <= len(orders_quantity)-1:
if orders_quantity[h]["side"] == "Sell":
sell_orders_quantity.append(orders_quantity[h])
elif orders_quantity[h]["side"] == "Buy":
buy_orders_quantity.append(orders_quantity[h])
h += 1
if positions_quantity > 0:
if sell_orders_quantity != []:
orderID = sell_orders_quantity[0]["orderID"]
client.Order.Order_amend(orderID=orderID, orderQty=orderQty*-2).result()
elif positions_quantity < 0:
if buy_orders_quantity != []:
orderID = buy_orders_quantity[0]["orderID"]
client.Order.Order_amend(orderID=orderID, orderQty=orderQty*2).result()
print("Your current position is " + str(positions_quantity))
print("This is iteration: " + str(t))
t += 1

As concerns my second question, I solved it in the following way:
In the command terminal, type: pip freeze > requirements.txt
The file contains all dependencies.
As concerns question 1 I still dont understand what code exactly needs to be put in the section main.py.
Thanks!

Cloud Functions is not an adequate product for your use case. They are mostly used for lightweight calculations or not high resource consuming methods.
The magic of CF consists in that they execute your code whenever you hit the URL that belongs to it. This is important to understand for your question number 1. If you want your function to work, you need to always create a method that accepts the "request" parameter. As it is the information from the HTTP request made when the URL is hit.
You can take a look at this document for reference.
You function should always start like this
from flask #import your dependencies
def my_awesome_function(request):
#Your logic
In this case you should write "my_awesome_function" on the Function to Execute textbox.
You also have to be careful with your resources, as CF have 5 presentations. They differ in CPU and Memory you can read more about this here.
This, among many reasons, you should not use Cloud Functions for your bot. I could recommend you to use a virtual machine, but activities related to use of the Services for cryptocurrency mining without Google's prior written approval are frowned upon and may result in the deactivation of your product as stated in the terms of service.

Function approximator and q-learning

I am trying to implement q-learning with an action-value approximation-function. I am using openai-gym and the "MountainCar-v0" enviroment to test my algorithm out. My problem is, it does not converge or find the goal at all.
Basically the approximator works like the following, you feed in the 2 features: position and velocity and one of the 3 actions in a one-hot encoding: 0 -> [1,0,0], 1 -> [0,1,0] and 2 -> [0,0,1]. The output is the action-value approximation Q_approx(s,a), for one specific action.
I know that usually, the input is the state (2 features) and the output layer contains 1 output for each action. The big difference that I see is that I have run the feed forward pass 3 times (one for each action) and take the max, while in the standard implementation you run it once and take the max over the output.
Maybe my implementation is just completely wrong and I am thinking wrong. Gonna paste the code here, it is a mess but I am just experimenting a bit:
import gym
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation
env = gym.make('MountainCar-v0')
# The mean reward over 20 episodes
mean_rewards = np.zeros(20)
# Feature numpy holder
features = np.zeros(5)
# Q_a value holder
qa_vals = np.zeros(3)
one_hot = {
0 : np.asarray([1,0,0]),
1 : np.asarray([0,1,0]),
2 : np.asarray([0,0,1])
}
model = Sequential()
model.add(Dense(20, activation="relu",input_dim=(5)))
model.add(Dense(10,activation="relu"))
model.add(Dense(1))
model.compile(optimizer='rmsprop',
loss='mse',
metrics=['accuracy'])
epsilon_greedy = 0.1
discount = 0.9
batch_size = 16
# Experience replay containing features and target
experience = np.ones((10*300,5+1))
# Ring buffer
def add_exp(features,target,index):
if index % experience.shape[0] == 0:
index = 0
global filled_once
filled_once = True
experience[index,0:5] = features
experience[index,5] = target
index += 1
return index
for e in range(0,100000):
obs = env.reset()
old_obs = None
new_obs = obs
rewards = 0
loss = 0
for i in range(0,300):
if old_obs is not None:
# Find q_a max for s_(t+1)
features[0:2] = new_obs
for i,pa in enumerate([0,1,2]):
features[2:5] = one_hot[pa]
qa_vals[i] = model.predict(features.reshape(-1,5))
rewards += reward
target = reward + discount*np.max(qa_vals)
features[0:2] = old_obs
features[2:5] = one_hot[a]
fill_index = add_exp(features,target,fill_index)
# Find new action
if np.random.random() < epsilon_greedy:
a = env.action_space.sample()
else:
a = np.argmax(qa_vals)
else:
a = env.action_space.sample()
obs, reward, done, info = env.step(a)
old_obs = new_obs
new_obs = obs
if done:
break
if filled_once:
samples_ids = np.random.choice(experience.shape[0],batch_size)
loss += model.train_on_batch(experience[samples_ids,0:5],experience[samples_ids,5].reshape(-1))[0]
mean_rewards[e%20] = rewards
print("e = {} and loss = {}".format(e,loss))
if e % 50 == 0:
print("e = {} and mean = {}".format(e,mean_rewards.mean()))
Thanks in advance!

There shouldn't be much difference between the actions as inputs to your network or as different outputs of your network. It does make a huge difference if your states are images for example. because Conv nets work very well with images and there would be no obvious way of integrating the actions to the input.
Have you tried the cartpole balancing environment? It is better to test if your model is working correctly.
Mountain climb is pretty hard. It has no reward until you reach the top, which often doesn't happen at all. The model will only start learning something useful once you get to the top once. If you are never getting to the top you should probably increase your time doing exploration. in other words take more random actions, a lot more...

Is Caffe's reported accuracy reliable?

I recently tried to get the confusion matrix for one of my trained models, to see how precise it is. I downloaded this script and fed my model. To my astonishment, the accuracy calculated by the script, is very different than the one, Caffe reports.
I have used this script to calculate the confusion matrix, this however, reports the accuracy as well, the problem is the accuracy reported by this script is way different that the one reported by Caffe! For example Caffe reports the accuracy lets say for CIFAR10, as 92.34%, while, when the model is fed to the script to calculate confusion matrix and its accuracy, it results in for example something like 86.5%!
Which one of these accuracies are the correct one, and can be reported in papers or compared with the results of other papers such as those here ?
I also saw something weird again, I trained two identical models, with only one difference, that being one used Xavier, and the other used msra for initialization. The first one reports an accuracy of 94.25 and the other reports 94.26 in Caffe. when these models are fed to the script I linked above, for confusion matrix computations. their accuracies were 89.2% and 87.4% respectively!
Is this normal? what is the cause for this? msra?
Are the accuracies reported by Caffe true and reliable?
PS: The accuracy in the script is calculated as (complete script):
for i, image, label in reader:
image_caffe = image.reshape(1, *image.shape)
out = net.forward_all(data=np.asarray([ image_caffe ]))
plabel = int(out['prob'][0].argmax(axis=0))
count += 1
iscorrect = label == plabel
correct += (1 if iscorrect else 0)
matrix[(label, plabel)] += 1
labels_set.update([label, plabel])
if not iscorrect:
print("\rError: i=%s, expected %i but predicted %i" \
% (i, label, plabel))
sys.stdout.write("\rAccuracy: %.1f%%" % (100.*correct/count))
sys.stdout.flush()
print(", %i/%i corrects" % (correct, count))
Which imho is OK and correct. the number of correct predictions, divided by the total number of instances in the dataset.

I found the reason.
The reason for the mismatch between Caffe generated accuracy and the accuract generated by the script in question, was solely because of mean-subtraction, which was done in caffe, and not in the script.
This is the modified version of script which takes this into account and hopefully everything is just fine.
# Author: Axel Angel, copyright 2015, license GPLv3.
# added mean subtraction so that, the accuracy can be reported accurately just like caffe when doing a mean subtraction
# Seyyed Hossein Hasan Pour
# Coderx7#Gmail.com
# 7/3/2016
import sys
import caffe
import numpy as np
import lmdb
import argparse
from collections import defaultdict
def flat_shape(x):
"Returns x without singleton dimension, eg: (1,28,28) -> (28,28)"
return x.reshape(filter(lambda s: s > 1, x.shape))
def lmdb_reader(fpath):
import lmdb
lmdb_env = lmdb.open(fpath)
lmdb_txn = lmdb_env.begin()
lmdb_cursor = lmdb_txn.cursor()
for key, value in lmdb_cursor:
datum = caffe.proto.caffe_pb2.Datum()
datum.ParseFromString(value)
label = int(datum.label)
image = caffe.io.datum_to_array(datum).astype(np.uint8)
yield (key, flat_shape(image), label)
def leveldb_reader(fpath):
import leveldb
db = leveldb.LevelDB(fpath)
for key, value in db.RangeIter():
datum = caffe.proto.caffe_pb2.Datum()
datum.ParseFromString(value)
label = int(datum.label)
image = caffe.io.datum_to_array(datum).astype(np.uint8)
yield (key, flat_shape(image), label)
def npz_reader(fpath):
npz = np.load(fpath)
xs = npz['arr_0']
ls = npz['arr_1']
for i, (x, l) in enumerate(np.array([ xs, ls ]).T):
yield (i, x, l)
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument('--proto', type=str, required=True)
parser.add_argument('--model', type=str, required=True)
parser.add_argument('--mean', type=str, required=True)
group = parser.add_mutually_exclusive_group(required=True)
group.add_argument('--lmdb', type=str, default=None)
group.add_argument('--leveldb', type=str, default=None)
group.add_argument('--npz', type=str, default=None)
args = parser.parse_args()
# Extract mean from the mean image file
mean_blobproto_new = caffe.proto.caffe_pb2.BlobProto()
f = open(args.mean, 'rb')
mean_blobproto_new.ParseFromString(f.read())
mean_image = caffe.io.blobproto_to_array(mean_blobproto_new)
f.close()
count = 0
correct = 0
matrix = defaultdict(int) # (real,pred) -> int
labels_set = set()
# CNN reconstruction and loading the trained weights
net = caffe.Net(args.proto, args.model, caffe.TEST)
caffe.set_mode_cpu()
print "args", vars(args)
if args.lmdb != None:
reader = lmdb_reader(args.lmdb)
if args.leveldb != None:
reader = leveldb_reader(args.leveldb)
if args.npz != None:
reader = npz_reader(args.npz)
for i, image, label in reader:
image_caffe = image.reshape(1, *image.shape)
out = net.forward_all(data=np.asarray([ image_caffe ])- mean_image)
plabel = int(out['prob'][0].argmax(axis=0))
count += 1
iscorrect = label == plabel
correct += (1 if iscorrect else 0)
matrix[(label, plabel)] += 1
labels_set.update([label, plabel])
if not iscorrect:
print("\rError: i=%s, expected %i but predicted %i" \
% (i, label, plabel))
sys.stdout.write("\rAccuracy: %.1f%%" % (100.*correct/count))
sys.stdout.flush()
print(", %i/%i corrects" % (correct, count))
print ""
print "Confusion matrix:"
print "(r , p) | count"
for l in labels_set:
for pl in labels_set:
print "(%i , %i) | %i" % (l, pl, matrix[(l,pl)])

FiPy Simple Convection

I am trying to understand how FiPy works by working an example, in particular I would like to solve the following simple convection equation with periodic boundary:
$$\partial_t u + \partial_x u = 0$$
If initial data is given by $u(x, 0) = F(x)$, then the analytical solution is $u(x, t) = F(x - t)$. I do get a solution, but it is not correct.
What am I missing? Is there a better resource for understanding FiPy than the documentation? It is very sparse...
Here is my attempt
from fipy import *
import numpy as np
# Generate mesh
nx = 20
dx = 2*np.pi/nx
mesh = PeriodicGrid1D(nx=nx, dx=dx)
# Generate solution object with initial discontinuity
phi = CellVariable(name="solution variable", mesh=mesh)
phiAnalytical = CellVariable(name="analytical value", mesh=mesh)
phi.setValue(1.)
phi.setValue(0., where=x > 1.)
# Define the pde
D = [[-1.]]
eq = TransientTerm() == ConvectionTerm(coeff=D)
# Set discretization so analytical solution is exactly one cell translation
dt = 0.01*dx
steps = 2*int(dx/dt)
# Set the analytical value at the end of simulation
phiAnalytical.setValue(np.roll(phi.value, 1))
for step in range(steps):
eq.solve(var=phi, dt=dt)
print(phi.allclose(phiAnalytical, atol=1e-1))

As addressed on the FiPy mailing list, FiPy is not great at handling convection only PDEs (absent diffusion, pure hyperbolic) as it's missing higher order convection schemes. It is better to use CLAWPACK for this class of problem.
FiPy does have one second order scheme that might help with this problem, the VanLeerConvectionTerm, see an example.
If the VanLeerConvectionTerm is used in the above problem, it does do a better job of preserving the shock.
import numpy as np
import fipy
# Generate mesh
nx = 20
dx = 2*np.pi/nx
mesh = fipy.PeriodicGrid1D(nx=nx, dx=dx)
# Generate solution object with initial discontinuity
phi = fipy.CellVariable(name="solution variable", mesh=mesh)
phiAnalytical = fipy.CellVariable(name="analytical value", mesh=mesh)
phi.setValue(1.)
phi.setValue(0., where=mesh.x > 1.)
# Define the pde
D = [[-1.]]
eq = fipy.TransientTerm() == fipy.VanLeerConvectionTerm(coeff=D)
# Set discretization so analytical solution is exactly one cell translation
dt = 0.01*dx
steps = 2*int(dx/dt)
# Set the analytical value at the end of simulation
phiAnalytical.setValue(np.roll(phi.value, 1))
viewer = fipy.Viewer(phi)
for step in range(steps):
eq.solve(var=phi, dt=dt)
viewer.plot()
raw_input('stopped')
print(phi.allclose(phiAnalytical, atol=1e-1))

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

how to solve module 'gym.wrappers' has no attribute 'Monitor'？ - reinforcement-learning

Use this from gym.wrappers.monitoring.video_recorder import VideoRecorder how to use it can be found here https://www.anyscale.com/blog/an-introduction-to-reinforcement-learning-with-openai-gym-rllib-and-google

Related

Deep Q Learning - Cartpole Environment

Google Cloud Functions - How to set up a function (trading bot)

Function approximator and q-learning

Is Caffe's reported accuracy reliable?

FiPy Simple Convection

Categories

Resources