New:
I am now working on a calculator for the terminal. I'm having the same error as before, but this time it's telling me this: <function div at 0x7faf88c9e488>
The code is at: https://github.com/mishiki1002/Solo-Calculater/blob/master/Solo-Main.py
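(The cause is the same one explained in the solved question below: printing the function object instead of calling it. A minimal sketch, assuming a div function roughly like the one in the linked repo:)

def div(a, b):
    return a / b

print(div)         # <function div at 0x...>: the function object itself
print(div(10, 2))  # 5.0: the function is actually called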
Solved:
I have been programming in Python for a year and a half. I have played around with functions (def) and variables. My current programming project is a fantasy RPG type game. I am still at the beginning, and I am wondering what kind of output this is and why I am getting it. I believe it is some kind of binary bit.
<function showInstructions at 0x7fa8933162f0>
When I run my code through my terminal using python main.py, that is what I get. This is my full source code for the game. Please leave any comments and/or suggestions about my programming, and please tell me if my code is sloppy. I try to document all I can so it's easier for me and the programmers around me to read.
Sincerely, Josh
#!/usr/bin/python3
# We will import this so that we have a random encounter chance for the game.
# This will also prove useful when we have attack sequences and loot drops
from random import randint
class Character:
    def __init__(self):
        # This is the name that you have made for your character
        self.name = ' '
        # Amount of health for your character
        self.health = 100
        # Max amount of health for your character
        self.health_max = 9999

    # Attack method needed for battles
    def do_damage(self, enemy):
        # Damage is a random roll, clamped so it is never below 0
        # and never more than the enemy's remaining health
        damage = min(max(randint(0, self.health) - randint(0, enemy.health), 0), enemy.health)
        # Actual damage system
        enemy.health = enemy.health - damage
        # Print the result to the user/gamer
        if damage == 0: print("%s evades %s's attack." % (enemy.name, self.name))
        # If you land a hit
        else: print("%s hurts %s!" % (self.name, enemy.name))
        # Report whether the enemy's health has dropped to 0
        return enemy.health <= 0
class Enemy(Character):
    def __init__(self, player):
        # Since we declared that the enemy is a Character, it takes a Character's parameters
        Character.__init__(self)
        # It needs a name, like a Goblin
        self.name = 'a Goblin'
        # And it needs health
        self.health = randint(1, 15 + player.health)
class Player(Character):
    # Now we are declaring a side character's role
    def __init__(self, player):
        Character.__init__(self)
        # When you check out the menu to see how your characters are doing,
        # you will see a message saying 'Normal' and the amount of health they have
        self.state = 'Normal'
        # Now we set their health
        self.health = 100
        # Max health
        self.health_max = 9999
# Start menu will be here
def showInstructions():
    print("Sound_Soul")
    print("----------")
    print("Commands:")
    print("  'go [direction]'")
    print("The Cardinal Directions:")
    print("  north")
    print("  east")
    print("  south")
    print("  west")

print(showInstructions)
You are getting that because you are using only the function name showInstructions instead of a call to the function, like so: showInstructions().
Edit:
One more thing: you don't need to also use print when calling your function, since you already have print statements within your function. Wrapping the call in print causes "None" to be printed below your intended text, because the function indeed returns None by default.
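A minimal sketch of the difference, using the showInstructions function above:

showInstructions          # just names the function object; printing it gives <function showInstructions at 0x...>
showInstructions()        # actually calls it, so the menu is printed
print(showInstructions()) # prints the menu, then an extra 'None', the function's return value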
Related
I have read some papers that use something called "Bootstrapped Cross Entropy Loss" to train their segmentation network. The idea is to take only the hardest k% (say 15%) of the pixels into account, to improve learning performance, especially when easy pixels dominate.
Currently, I am using the standard cross entropy:
loss = F.binary_cross_entropy(mask, gt)
How do I convert this to the bootstrapped version efficiently in PyTorch?
Often we would also add a "warm-up" period to the loss, so that the network can first learn to adapt to the easy regions and then transition to the harder ones.
This implementation starts from k=100 and keeps it there for 20000 iterations, then linearly decays it to k=15 over another 50000 iterations.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BootstrappedCE(nn.Module):
    def __init__(self, start_warm=20000, end_warm=70000, top_p=0.15):
        super().__init__()
        self.start_warm = start_warm
        self.end_warm = end_warm
        self.top_p = top_p

    def forward(self, input, target, it):
        # Before the decay starts, use plain cross entropy over all pixels (p = 1.0)
        if it < self.start_warm:
            return F.cross_entropy(input, target), 1.0

        raw_loss = F.cross_entropy(input, target, reduction='none').view(-1)
        num_pixels = raw_loss.numel()
        # Linearly decay the kept fraction from 1.0 down to top_p between start_warm and end_warm
        if it > self.end_warm:
            this_p = self.top_p
        else:
            this_p = self.top_p + (1 - self.top_p) * ((self.end_warm - it) / (self.end_warm - self.start_warm))
        # Keep only the hardest this_p fraction of pixel losses
        loss, _ = torch.topk(raw_loss, int(num_pixels * this_p), sorted=False)
        return loss.mean(), this_p
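A short usage sketch, assuming multi-class logits of shape (batch, classes, H, W) and integer labels (the shapes and iteration value are placeholders, not from the original answer):

criterion = BootstrappedCE(start_warm=20000, end_warm=70000, top_p=0.15)
logits = torch.randn(4, 2, 64, 64, requires_grad=True)
labels = torch.randint(0, 2, (4, 64, 64))
loss, kept_p = criterion(logits, labels, it=30000)  # mid warm-up: kept_p is between 0.15 and 1.0
loss.backward()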
An addition to the self-answer by @hkchengrex (for my future self and for API parity with PyTorch):
One could implement a functional version first (with the additional arguments provided by the original torch.nn.functional.cross_entropy), like this (I also prefer reduction to be a callable instead of one of the predefined strings):
import typing

import torch

def bootstrapped_cross_entropy(
    inputs,
    targets,
    iteration,
    p: float,
    warmup: typing.Union[typing.Callable[[float, int], float], int] = -1,
    weight=None,
    ignore_index=-100,
    reduction: typing.Callable[[torch.Tensor], torch.Tensor] = torch.mean,
):
    if not 0 < p < 1:
        raise ValueError("p should be in (0, 1) range, got: {}".format(p))

    # An int warmup keeps all elements until that iteration; a callable computes p itself
    if isinstance(warmup, int):
        this_p = 1.0 if iteration < warmup else p
    elif callable(warmup):
        this_p = warmup(p, iteration)
    else:
        raise ValueError(
            "warmup should be int or callable, got {}".format(type(warmup))
        )

    # Shortcut: nothing is discarded, so just apply the (callable) reduction to the raw losses
    if this_p == 1.0:
        return reduction(
            torch.nn.functional.cross_entropy(
                inputs, targets, weight=weight, ignore_index=ignore_index, reduction="none"
            )
        )

    raw_loss = torch.nn.functional.cross_entropy(
        inputs, targets, weight=weight, ignore_index=ignore_index, reduction="none"
    ).view(-1)
    num_pixels = raw_loss.numel()

    # Keep only the hardest this_p fraction of element losses
    loss, _ = torch.topk(raw_loss, int(num_pixels * this_p), sorted=False)
    return reduction(loss)
Also, warmup can be specified as a callable (taking p and the current iteration) or as an int, which allows for flexible or easy scheduling.
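For example, a linear-decay schedule like the one in BootstrappedCE above could be written as a callable; the constants here are illustrative assumptions mirroring that class, not part of the original answer:

def linear_warmup(p, iteration, start_warm=20000, end_warm=70000):
    # keep everything before start_warm, decay linearly to p, then hold at p
    if iteration < start_warm:
        return 1.0
    if iteration > end_warm:
        return p
    return p + (1 - p) * ((end_warm - iteration) / (end_warm - start_warm))

# hypothetical call: bootstrapped_cross_entropy(logits, labels, it, p=0.15, warmup=linear_warmup)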
And here is a class based on _WeightedLoss, with the iteration incremented automatically during each call (so only inputs and targets have to be passed):
class BootstrappedCrossEntropy(torch.nn.modules.loss._WeightedLoss):
    def __init__(
        self,
        p: float,
        warmup: typing.Union[typing.Callable[[float, int], float], int] = -1,
        weight=None,
        ignore_index=-100,
        reduction: typing.Callable[[torch.Tensor], torch.Tensor] = torch.mean,
    ):
        self.p = p
        self.warmup = warmup
        self.ignore_index = ignore_index
        self._current_iteration = -1

        super().__init__(weight, size_average=None, reduce=None, reduction=reduction)

    def forward(self, inputs, targets):
        # Each forward call advances the iteration counter used by the warmup schedule
        self._current_iteration += 1
        return bootstrapped_cross_entropy(
            inputs,
            targets,
            self._current_iteration,
            self.p,
            self.warmup,
            self.weight,
            self.ignore_index,
            self.reduction,
        )
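A brief usage sketch under the same assumptions as above (placeholder shapes; iteration 0 is still inside the int warmup, so this reduces to plain cross entropy):

criterion = BootstrappedCrossEntropy(p=0.15, warmup=20000)
logits = torch.randn(4, 2, 64, 64, requires_grad=True)
labels = torch.randint(0, 2, (4, 64, 64))
loss = criterion(logits, labels)
loss.backward()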
This is my code in Python. I really don't know how to do this; I am just a beginner. I hope someone can understand my question and help me.
def get_float(prompt, low, high):
    while True:
        prompt = input("Enter monthly investment:")
        number = float(input(prompt))
        if number > low or number <= high:
            is_valid = True
            return number
        else:
            print("Entry must be greater than {low}and less than or equal to {high}")
def main():
    get_float(prompt, low, high)

if __name__ == "__main__":
    main()
In main, you are passing the prompt variable into get_float. However, prompt is not defined in main, so you are attempting to pass an undefined variable, which is not allowed.
In fact, given that get_float reads the prompt from input (and does not use the value passed in), you do not need to pass prompt into get_float at all, and it can be removed from the function signature.
You cannot pass prompt as a function argument because you are reading the value inside the function:
def get_float(low, high):
    while True:
        number = float(input("Enter monthly investment: "))
        # 'and', not 'or': the value must satisfy both bounds to be accepted
        if number > low and number <= high:
            return number
        else:
            # f-string so {low} and {high} are actually substituted
            print(f"Entry must be greater than {low} and less than or equal to {high}")

def main():
    get_float(200, 1000)

if __name__ == "__main__":
    main()
I have started a Python class and my book does not seem to help me.
My professor has a program that bombards my code with different inputs, and if any of the inputs do not work, my code is "wrong". I have done many days' worth of editing and am at a complete loss. I have the code working if someone enters an actual number, but where my code fails the test is when the input is "miles_to_laps(26)"; then it errors out.
I have tried changing the input to int(input()), but that does not fix the issue. I've gone through changing variables and even changing the input method, but I am still at a loss. I have already tried contacting my teacher, but after 6 days of no response and 3 days of being late, I feel like I'm just going nowhere.
user_miles = int(input())

def miles_to_laps(user_miles):
    x = user_miles
    y = 4
    x2 = x * y
    result = print('%0.2f' % float(x2))
    return result

miles_to_laps(user_miles)
My code works for plain number inputs, but my professor wants inputs like miles_to_laps(26) and miles_to_laps(13) to produce the same outputs.
For the weird input functionality you can try:
import re

def parse_function_text(s):
    try:
        return re.search(r"miles_to_laps\((.+)\)", s)[1]
    except TypeError:
        return None

def accept_input(user_input):
    desugar = parse_function_text(user_input)
    if desugar is not None:
        user_input = desugar
    try:
        return float(user_input)
    except ValueError:
        raise ValueError("Cannot process input %s" % user_input)

assert accept_input("miles_to_laps(3.5)") == 3.5
I'm trying to keep all the pedantry aside, but what kind of CS/programming teaching is that?
Areas of concern:
separate user input from rest of code
separate output formatting from function output
the code inside miles_to_laps is excessive
Now here is the code to try:
LAPS_PER_MILE = 4

# the only calculation, a "pure" function
def miles_to_laps(miles):
    return LAPS_PER_MILE * miles

# sorting out valid vs invalid input, the "interface"
def accept_input(user_input):
    try:
        return float(user_input)
    except ValueError:
        raise ValueError("Cannot process input %s" % user_input)

if __name__ == "__main__":
    # running the program
    laps = miles_to_laps(accept_input(input()))
    print('%0.2f' % laps)
Hope this is not too overwhelming.
Update: second attempt
MILE = 1609.34  # meters per mile
LAP = 400       # track lap, in meters
LAPS_PER_MILE = MILE / LAP

def miles_to_laps(miles):
    return LAPS_PER_MILE * miles

def laps_coerced(laps):
    return '%0.2f' % laps

def accept_input(user_input):
    try:
        return float(user_input)
    except ValueError:
        raise ValueError("Cannot process input %s" % user_input)

def main(user_input_str):
    miles = accept_input(user_input_str)
    laps = miles_to_laps(miles)
    print(laps_coerced(laps))

if __name__ == "__main__":
    main(input())
I am trying to implement Q-learning with an action-value approximation function. I am using openai-gym and the "MountainCar-v0" environment to test my algorithm. My problem is that it does not converge or find the goal at all.
Basically the approximator works as follows: you feed in the 2 features, position and velocity, and one of the 3 actions in a one-hot encoding: 0 -> [1,0,0], 1 -> [0,1,0] and 2 -> [0,0,1]. The output is the action-value approximation Q_approx(s,a) for that specific action.
I know that usually the input is the state (2 features) and the output layer contains 1 output for each action. The big difference I see is that I run the feed-forward pass 3 times (once for each action) and take the max, while in the standard implementation you run it once and take the max over the outputs.
Maybe my implementation is just completely wrong and I am thinking about it wrong. I'm going to paste the code here; it is a mess, but I am just experimenting a bit:
import gym
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation

env = gym.make('MountainCar-v0')

# The mean reward over 20 episodes
mean_rewards = np.zeros(20)
# Feature numpy holder
features = np.zeros(5)
# Q_a value holder
qa_vals = np.zeros(3)

one_hot = {
    0: np.asarray([1, 0, 0]),
    1: np.asarray([0, 1, 0]),
    2: np.asarray([0, 0, 1])
}

model = Sequential()
model.add(Dense(20, activation="relu", input_dim=(5)))
model.add(Dense(10, activation="relu"))
model.add(Dense(1))

model.compile(optimizer='rmsprop',
              loss='mse',
              metrics=['accuracy'])

epsilon_greedy = 0.1
discount = 0.9
batch_size = 16

# Experience replay containing features and target
experience = np.ones((10 * 300, 5 + 1))

# Ring buffer
def add_exp(features, target, index):
    if index % experience.shape[0] == 0:
        index = 0
        global filled_once
        filled_once = True
    experience[index, 0:5] = features
    experience[index, 5] = target
    index += 1
    return index

for e in range(0, 100000):
    obs = env.reset()
    old_obs = None
    new_obs = obs
    rewards = 0
    loss = 0
    for i in range(0, 300):
        if old_obs is not None:
            # Find q_a max for s_(t+1)
            features[0:2] = new_obs
            for i, pa in enumerate([0, 1, 2]):
                features[2:5] = one_hot[pa]
                qa_vals[i] = model.predict(features.reshape(-1, 5))
            rewards += reward
            target = reward + discount * np.max(qa_vals)
            features[0:2] = old_obs
            features[2:5] = one_hot[a]
            fill_index = add_exp(features, target, fill_index)
            # Find new action
            if np.random.random() < epsilon_greedy:
                a = env.action_space.sample()
            else:
                a = np.argmax(qa_vals)
        else:
            a = env.action_space.sample()
        obs, reward, done, info = env.step(a)
        old_obs = new_obs
        new_obs = obs
        if done:
            break
    if filled_once:
        samples_ids = np.random.choice(experience.shape[0], batch_size)
        loss += model.train_on_batch(experience[samples_ids, 0:5], experience[samples_ids, 5].reshape(-1))[0]
    mean_rewards[e % 20] = rewards
    print("e = {} and loss = {}".format(e, loss))
    if e % 50 == 0:
        print("e = {} and mean = {}".format(e, mean_rewards.mean()))
Thanks in advance!
There shouldn't be much difference between feeding the actions as inputs to your network and having them as different outputs of your network. It does make a huge difference if your states are images, for example, because conv nets work very well with images and there would be no obvious way of integrating the actions into the input.
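For contrast, a minimal sketch of the standard architecture, with one Q-value output per action so a single forward pass suffices (my own illustration, assuming the same 2-feature state; not code from the question):

from keras.models import Sequential
from keras.layers import Dense

standard_model = Sequential()
standard_model.add(Dense(20, activation="relu", input_dim=2))  # state only: position, velocity
standard_model.add(Dense(10, activation="relu"))
standard_model.add(Dense(3))  # one Q-value per action
standard_model.compile(optimizer='rmsprop', loss='mse')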
Have you tried the CartPole balancing environment? It is better for testing whether your model is working correctly.
MountainCar is pretty hard. It has no reward until you reach the top, which often doesn't happen at all. The model will only start learning something useful once you get to the top once. If you are never getting to the top, you should probably increase your time spent exploring; in other words, take more random actions, a lot more...
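A sketch of what "a lot more exploration" could look like: start almost fully random and decay epsilon over episodes (the constants here are illustrative assumptions, not tuned values):

import numpy as np

def decayed_epsilon(episode, eps_start=1.0, eps_min=0.1, eps_decay=0.999):
    # exponentially decay exploration from eps_start toward eps_min
    return max(eps_min, eps_start * (eps_decay ** episode))

# e.g. replace the fixed epsilon_greedy = 0.1 in the loop above with:
# if np.random.random() < decayed_epsilon(e): a = env.action_space.sample()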
I was implementing DQN on the MountainCar problem of openai gym. This problem is special because the positive reward is very sparse, so I thought of implementing prioritized experience replay as proposed in this paper by Google DeepMind.
There are certain things that are confusing me:
How do we store the replay memory? I get that p_i is the priority of a transition and there are two ways to compute it, but what is this P(i)?
If we follow the rules given, won't P(i) change every time a sample is added?
What does it mean when it says "we sample according to this probability distribution"? What is the distribution?
Finally, how do we sample from it? I get that if we stored the transitions in a priority queue we could sample directly, but we actually store them in a sum tree.
Thanks in advance
According to the paper, there are two ways of calculating p_i, and depending on your choice, your implementation differs. I assume you selected proportional prioritization; then you should use a "sum-tree" data structure to store each pair of a transition and its p_i. P(i) is just the normalized version of p_i, P(i) = p_i^alpha / sum_k p_k^alpha, and it shows how important that transition is, or in other words, how effective it is for improving your network. When P(i) is high, the transition is very surprising to the network, so it can really help the network tune itself.
You should add each new transition with maximal priority to make sure it will be replayed at least once, and there is no need to update the whole experience replay memory for each newly arriving transition. During the experience replay process, you select a mini-batch and update only the priorities of the experiences in that mini-batch.
Each experience has a probability, so all of the experiences together form a distribution, and we select our next mini-batch according to this distribution.
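A tiny sketch of that distribution, ignoring the sum tree for the moment (a naive O(n) version; the priorities and alpha here are made-up illustrative values):

import numpy as np

priorities = np.array([1.0, 4.0, 2.0, 8.0])   # p_i for four stored transitions
alpha = 0.6
probs = priorities ** alpha / np.sum(priorities ** alpha)   # P(i)
batch = np.random.choice(len(priorities), size=2, p=probs)  # sample a mini-batch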
You can sample via this policy from your sum-tree:
def retrieve(n, s):
    if n is leaf_node: return n
    if n.left.val >= s: return retrieve(n.left, s)
    else: return retrieve(n.right, s - n.left.val)
I have taken the code from here.
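For concreteness, here is a runnable array-based equivalent of that walk; this is my own toy sketch (a flat-array sum tree), not the linked code:

def retrieve(tree, s, capacity):
    # tree is a flat array: internal nodes at [1, capacity), leaves at [capacity, 2*capacity)
    idx = 1
    while idx < capacity:           # stop once idx points at a leaf
        left = 2 * idx
        if tree[left] >= s:
            idx = left
        else:
            s -= tree[left]
            idx = left + 1
    return idx - capacity           # leaf position = transition index

# toy example: 4 leaves with priorities [1, 4, 2, 8]
tree = [0.0] * 8
tree[4:8] = [1.0, 4.0, 2.0, 8.0]
tree[2] = tree[4] + tree[5]; tree[3] = tree[6] + tree[7]
tree[1] = tree[2] + tree[3]
assert retrieve(tree, 5.5, 4) == 2  # mass 5.5 lands in the third leaf (1+4 < 5.5 <= 1+4+2)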
You can reuse the code in OpenAI Baselines, which implements this with a SumSegmentTree:
import numpy as np
import random

from baselines.common.segment_tree import SumSegmentTree, MinSegmentTree


class ReplayBuffer(object):
    def __init__(self, size):
        """Create Replay buffer.

        Parameters
        ----------
        size: int
            Max number of transitions to store in the buffer. When the buffer
            overflows the old memories are dropped.
        """
        self._storage = []
        self._maxsize = size
        self._next_idx = 0

    def __len__(self):
        return len(self._storage)

    def add(self, obs_t, action, reward, obs_tp1, done):
        data = (obs_t, action, reward, obs_tp1, done)

        if self._next_idx >= len(self._storage):
            self._storage.append(data)
        else:
            self._storage[self._next_idx] = data
        self._next_idx = (self._next_idx + 1) % self._maxsize

    def _encode_sample(self, idxes):
        obses_t, actions, rewards, obses_tp1, dones = [], [], [], [], []
        for i in idxes:
            data = self._storage[i]
            obs_t, action, reward, obs_tp1, done = data
            obses_t.append(np.array(obs_t, copy=False))
            actions.append(np.array(action, copy=False))
            rewards.append(reward)
            obses_tp1.append(np.array(obs_tp1, copy=False))
            dones.append(done)
        return np.array(obses_t), np.array(actions), np.array(rewards), np.array(obses_tp1), np.array(dones)

    def sample(self, batch_size):
        """Sample a batch of experiences.

        Parameters
        ----------
        batch_size: int
            How many transitions to sample.

        Returns
        -------
        obs_batch: np.array
            batch of observations
        act_batch: np.array
            batch of actions executed given obs_batch
        rew_batch: np.array
            rewards received as results of executing act_batch
        next_obs_batch: np.array
            next set of observations seen after executing act_batch
        done_mask: np.array
            done_mask[i] = 1 if executing act_batch[i] resulted in
            the end of an episode and 0 otherwise.
        """
        idxes = [random.randint(0, len(self._storage) - 1) for _ in range(batch_size)]
        return self._encode_sample(idxes)


class PrioritizedReplayBuffer(ReplayBuffer):
    def __init__(self, size, alpha):
        """Create Prioritized Replay buffer.

        Parameters
        ----------
        size: int
            Max number of transitions to store in the buffer. When the buffer
            overflows the old memories are dropped.
        alpha: float
            how much prioritization is used
            (0 - no prioritization, 1 - full prioritization)

        See Also
        --------
        ReplayBuffer.__init__
        """
        super(PrioritizedReplayBuffer, self).__init__(size)
        assert alpha >= 0
        self._alpha = alpha

        # Segment trees need a power-of-two capacity
        it_capacity = 1
        while it_capacity < size:
            it_capacity *= 2

        self._it_sum = SumSegmentTree(it_capacity)
        self._it_min = MinSegmentTree(it_capacity)
        self._max_priority = 1.0

    def add(self, *args, **kwargs):
        """See ReplayBuffer.store_effect"""
        # New transitions enter with the maximum priority seen so far
        idx = self._next_idx
        super().add(*args, **kwargs)
        self._it_sum[idx] = self._max_priority ** self._alpha
        self._it_min[idx] = self._max_priority ** self._alpha

    def _sample_proportional(self, batch_size):
        res = []
        p_total = self._it_sum.sum(0, len(self._storage) - 1)
        every_range_len = p_total / batch_size
        for i in range(batch_size):
            # Stratified sampling: one prefix-sum mass drawn per batch segment
            mass = random.random() * every_range_len + i * every_range_len
            idx = self._it_sum.find_prefixsum_idx(mass)
            res.append(idx)
        return res

    def sample(self, batch_size, beta):
        """Sample a batch of experiences.

        compared to ReplayBuffer.sample
        it also returns importance weights and idxes
        of sampled experiences.

        Parameters
        ----------
        batch_size: int
            How many transitions to sample.
        beta: float
            To what degree to use importance weights
            (0 - no corrections, 1 - full correction)

        Returns
        -------
        obs_batch: np.array
            batch of observations
        act_batch: np.array
            batch of actions executed given obs_batch
        rew_batch: np.array
            rewards received as results of executing act_batch
        next_obs_batch: np.array
            next set of observations seen after executing act_batch
        done_mask: np.array
            done_mask[i] = 1 if executing act_batch[i] resulted in
            the end of an episode and 0 otherwise.
        weights: np.array
            Array of shape (batch_size,) and dtype np.float32
            denoting importance weight of each sampled transition
        idxes: np.array
            Array of shape (batch_size,) and dtype np.int32
            indexes in buffer of sampled experiences
        """
        assert beta > 0

        idxes = self._sample_proportional(batch_size)

        weights = []
        p_min = self._it_min.min() / self._it_sum.sum()
        max_weight = (p_min * len(self._storage)) ** (-beta)

        for idx in idxes:
            p_sample = self._it_sum[idx] / self._it_sum.sum()
            weight = (p_sample * len(self._storage)) ** (-beta)
            weights.append(weight / max_weight)
        weights = np.array(weights)
        encoded_sample = self._encode_sample(idxes)
        return tuple(list(encoded_sample) + [weights, idxes])

    def update_priorities(self, idxes, priorities):
        """Update priorities of sampled transitions.

        sets priority of transition at index idxes[i] in buffer
        to priorities[i].

        Parameters
        ----------
        idxes: [int]
            List of idxes of sampled transitions
        priorities: [float]
            List of updated priorities corresponding to
            transitions at the sampled idxes denoted by
            variable `idxes`.
        """
        assert len(idxes) == len(priorities)
        for idx, priority in zip(idxes, priorities):
            assert priority > 0
            assert 0 <= idx < len(self._storage)
            self._it_sum[idx] = priority ** self._alpha
            self._it_min[idx] = priority ** self._alpha

            self._max_priority = max(self._max_priority, priority)
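Finally, a hedged usage sketch (dummy transitions and placeholder hyperparameters; in real training the new priorities would come from the batch's TD-errors):

buffer = PrioritizedReplayBuffer(size=2**16, alpha=0.6)

# store some dummy transitions: (obs_t, action, reward, obs_tp1, done)
for t in range(100):
    buffer.add(np.random.randn(2), 0, 0.0, np.random.randn(2), False)

obs, act, rew, obs_next, done, weights, idxes = buffer.sample(batch_size=32, beta=0.4)

# after computing TD-errors for the sampled batch, refresh those priorities
new_priorities = np.abs(np.random.randn(32)) + 1e-6  # placeholder for |TD-error| + eps
buffer.update_priorities(idxes, new_priorities)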