I'm writing a little music quiz game ("guess the note") in Python and am using PyAudio to generate the audio:
import numpy as np
from scipy import signal
from pyaudio import PyAudio, paFloat32

def gen_tone(frequency, duration=5.0, volume=0.5, fs=44100):
    p = PyAudio()
    # Sine wave
    # samples = (np.sin(2*np.pi*np.arange(fs*duration)*frequency/fs)).astype(np.float32)
    # Sawtooth wave. Can also be done with signal.sawtooth.
    # samples = (np.modf(np.arange(fs*duration)*frequency/fs)[0]*2.0-1.0).astype(np.float32)
    # DO NOT USE Triangle wave
    # t = np.linspace(0, duration, fs*duration)
    # samples = signal.sawtooth(2 * np.pi * t * frequency, width=0.5)
    stream = p.open(format=paFloat32,
                    channels=1,
                    rate=fs,
                    output=True)
    stream.write(volume * samples)
    stream.stop_stream()
    stream.close()
    p.terminate()
The sine wave didn't sound nice, so I tried sawtooth. I wanted to test a triangle wave next. The (commented-out) code above works, but when it plays through my earbuds, even at my MacBook's lowest volume, it produces a (quite possibly literally) deafening noise through them. Do not try this at home.
The plot of the generated samples looks reasonable (plot image omitted here).
What's going on?
I am trying to implement a multitask neural network described in a paper, but I am quite unsure how I should code the multitask part because the authors did not provide code for it.
The network architecture looks like this (figure from the paper, omitted here).
To make it simpler, the architecture can be generalized as follows (for the demo I changed their more complicated operation on the pair of individual embeddings to concatenation; figure omitted here).
The authors sum the loss from the individual tasks and the pairwise tasks, and use the total loss to optimize the parameters of the three networks (encoder, MLP-1, MLP-2) in each batch, but I am kind of at sea as to how different types of data are combined in a single batch and fed into two different networks that share an initial encoder. I tried to search for other networks with a similar structure but did not find any. Would appreciate any thoughts!
This is actually a common pattern. It can be handled with code like the following:
import torch
from torch import nn

class Network(nn.Module):
    def __init__(self, ...):
        super().__init__()
        self.encoder = DrugTargetInteractionNetwork()
        self.mlp1 = ClassificationMLP()
        self.mlp2 = PairwiseMLP()

    def forward(self, data_a, data_b):
        a_encoded = self.encoder(data_a)
        b_encoded = self.encoder(data_b)
        a_classified = self.mlp1(a_encoded)
        b_classified = self.mlp1(b_encoded)
        # let me assume data_a and data_b are of shape
        # [batch_size, n_molecules, n_features],
        # and that those n_molecules are not necessarily
        # equal. This can be generalized to more dimensions.
        a_broadcast, b_broadcast = torch.broadcast_tensors(
            a_encoded[:, None, :, :],
            b_encoded[:, :, None, :],
        )
        # this will work if your mlp2 accepts an arbitrary number of
        # leading dimensions and just broadcasts over them. That's true,
        # for example, if it uses just Linear and pointwise
        # operations, but it may fail if it makes specific assumptions
        # about the number of dimensions of its inputs
        pairwise_classified = self.mlp2(a_broadcast, b_broadcast)
        # if that is a problem, you have to reshape so that it
        # works. Most torch models accept at least a leading batch dimension
        # for vectorization, so we can "fold" the pairwise dimensions
        # into the batch dimension, presenting the input as
        # [batch*n_mol_2*n_mol_1, n_features] to mlp2,
        # and then recover the shape afterwards.
        # After broadcasting, both tensors have the same 4-D shape.
        B, N2, N1, N_feat = a_broadcast.shape
        a_batched = a_broadcast.reshape(B * N2 * N1, N_feat)
        b_batched = b_broadcast.reshape(B * N2 * N1, N_feat)
        # above, -1 would suffice instead of B*N2*N1, just being explicit
        batch_output = self.mlp2(a_batched, b_batched)
        # this should be exactly the same as `pairwise_classified`
        alternative_classified = batch_output.reshape(B, N2, N1, -1)
        return a_classified, b_classified, pairwise_classified
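As for the loss-summing part of the question, a minimal training step could look like the sketch below. The loss choices and the dataloader are assumptions for illustration, not taken from the paper:

# Hypothetical training step: sum the individual-task and pairwise-task
# losses and backpropagate through the shared encoder and both MLPs at once.
model = Network()  # constructor arguments as needed
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion_task = nn.CrossEntropyLoss()   # assumed individual-task loss
criterion_pair = nn.BCEWithLogitsLoss()  # assumed pairwise-task loss

for data_a, data_b, y_a, y_b, y_pair in loader:  # hypothetical dataloader
    optimizer.zero_grad()
    a_out, b_out, pair_out = model(data_a, data_b)
    # sum the individual and pairwise losses into one total loss
    total_loss = (criterion_task(a_out, y_a)
                  + criterion_task(b_out, y_b)
                  + criterion_pair(pair_out, y_pair))
    total_loss.backward()  # gradients flow into encoder, mlp1, and mlp2
    optimizer.step()

Because the losses are summed before backward(), the shared encoder receives gradients from both task heads in every batch, which matches the "single total loss" behavior the question describes.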
Hey, I'm working on Pong, and no matter how hard I try, I keep facing the same problem: the scoring system doesn't appear in the window when I run the code.
I was told I need to initialize attributes such as the scoring system, ball, and paddles, and I guess I did so for the ball and paddles, even though I don't really get what that means, since this course is the farthest thing from being kind to students who are new to Python. But I guess I didn't do so for the scoring system, since it's not appearing on the screen.
I've been struggling with this for the past few weeks and still haven't found a way to fix it, and I need to hand this in next week, which is making me freak out.
I would appreciate it if you could explain to me what I did wrong and what initializing an attribute means, and also check my code for collision detection!
Thank you all SO MUCH!! (By the way, this always happens and confuses me a lot, since I get no error message but the scoring system just doesn't work as I expected.)
# Pong V2
# Pong needs two players; each player tries to prevent the ball from hitting
# the edge of the screen on their side. If the ball hits the edge of the
# window, the opponent of the player who couldn't defend that wall gains a
# point. When one of the players reaches 11 points, the game ends and the
# player with the higher score wins. Players can move their own paddles by
# pressing certain keys, and the ball bounces off when it hits the front part
# of a paddle but passes through when it hits the back part. The ball bounces
# off the walls, and the game continues until the player presses the 'x'
# button to close the window. The paddles are drawn on the screen but cannot
# be moved yet.
# Version 2 of Pong implements the scoring system and ends the game when
# someone reaches 11 points. Collision detection between the paddles and the
# ball has also been added as a new feature, even though the player still
# cannot move the paddles.

import pygame
from random import randint


def main():
    # initialize all pygame modules (some need initialization)
    pygame.init()
    # create a pygame display window
    pygame.display.set_mode((500, 400))
    # set the title of the display window
    pygame.display.set_caption('Pong')
    # get the display surface
    w_surface = pygame.display.get_surface()
    # create a game object
    game = Game(w_surface)
    # start the main game loop by calling the play method on the game object
    game.play()
    # quit pygame and clean up the pygame window
    pygame.quit()

class Game:
    def __init__(self, surface):
        # Initialize a Game.
        self.surface = surface
        self.bg_color = pygame.Color('black')
        self.FPS = 60
        self.game_Clock = pygame.time.Clock()
        self.close_clicked = False
        self.continue_game = True
        # initialize the ball
        self.ball = Ball('white', 5, [250, 200], [1, 2], self.surface)
        self.max_frames = 150
        self.frame_counter = 0
        # initialize paddles
        self.paddleA = pygame.Rect((70, 170), (10, 60))
        self.paddleB = pygame.Rect((415, 170), (10, 60))
        # initialize scores
        self.scoreA = 0
        self.scoreB = 0

    def draw_score_A(self):
        # this method draws the player's score in the top-left corner of the
        # game window.
        strscoreA = str(self.scoreA)
        font_color = pygame.Color('white')
        font_bg = pygame.Color('black')
        font = pygame.font.SysFont("arial", 72)
        text_img = font.render(strscoreA, True, font_color, font_bg)
        text_pos = (0, 0)
        self.surface.blit(text_img, text_pos)

    def draw_score_B(self):
        # this method draws the player's score in the top-right corner of the
        # game window.
        strscoreB = str(self.scoreB)
        font_color = pygame.Color('white')
        font_bg = pygame.Color('black')
        font = pygame.font.SysFont("arial", 72)
        text_img = font.render(strscoreB, True, font_color, font_bg)
        text_pos = (425, 0)
        self.surface.blit(text_img, text_pos)

    def update_score(self):
        # check if the ball has hit the left or right edge and update the
        # score: if the ball hit the left side, scoreB goes up by 1, and if
        # the ball hit the right side, scoreA goes up by 1.
        if self.ball.center[0] < self.ball.radius:
            self.scoreB += 1
        if self.ball.center[0] + self.ball.radius > self.surface.get_width():
            self.scoreA += 1
        return self.scoreA, self.scoreB

    def play(self):
        # Play the game until the player presses the close box.
        # - self is the Game that should be continued or not.
        while not self.close_clicked:  # until player clicks close box
            # play frame
            self.handle_events()
            self.draw()
            self.update_score()
            self.draw_score_A()
            self.draw_score_B()
            if self.continue_game:
                self.update()
                self.decide_continue()
            self.game_Clock.tick(self.FPS)  # run at most with FPS Frames Per Second

    def handle_events(self):
        # Handle each user event by changing the game state appropriately.
        # - self is the Game whose events will be handled
        events = pygame.event.get()
        for event in events:
            if event.type == pygame.QUIT:
                self.close_clicked = True

    def draw(self):
        # Draw all game objects.
        self.surface.fill(self.bg_color)  # clear the display surface first
        # draw ball on the surface
        self.ball.draw()
        # draw both paddles on the surface
        paddleA = pygame.Rect((70, 170), (10, 60))
        self.paddleA = pygame.draw.rect(self.surface, (255, 255, 255), paddleA)
        paddleB = pygame.Rect((415, 170), (10, 60))
        self.paddleB = pygame.draw.rect(self.surface, (255, 255, 255), paddleB)
        pygame.display.update()  # make the updated surface appear on the display

    def collision_detection(self):
        # figure out if the paddle and the ball have collided or not.
        # If the ball has hit the front of the paddle, it is recognized as
        # a collision and bounces off in the opposite direction. However,
        # if the ball has hit the back side of the paddle, this is not
        # considered a collision and will not make the ball bounce off, but
        # pass through the paddle. We need to check if the center of the ball
        # has passed the center of the paddle or not, and for this we will use
        # rect.collidepoint.
        if self.paddleA.collidepoint(self.center) and self.velocity[index] > 0 is True:
            self.velocity[index] = self.velocity[index]
        elif self.paddleA.collidepoint(self.center) and self.velocity[index] < 0 is True:
            self.velocity[index] = -self.velocity[index]
        elif self.paddleB.collidepoint(self.center) and self.velocity[index] > 0 is True:
            self.velocity[index] = -self.velocity[index]
        elif self.paddleB.collidepoint(self.center) and self.velocity[index] < 0 is True:
            self.velocity[index] = self.velocity[index]

    def update(self):
        # Update the game objects for the next frame.
        self.ball.move()
        self.frame_counter += self.frame_counter

    def decide_continue(self):
        # Check and remember whether the game should continue.
        # If the score of one of the players reaches 11, the game should end,
        # so we check whether anyone's score has reached 11 and decide whether
        # to continue the game.
        if self.scoreA == 11 or self.scoreB == 11:
            self.continue_game = False

class Ball:
    # An object in this class represents a ball that moves

    def __init__(self, ball_color, ball_radius, ball_center, ball_velocity,
                 surface):
        # Initialize a ball.
        # - self is the ball to initialize
        # - color is the pygame.Color of the ball
        # - center is a list containing the x and y int
        #   coords of the center of the dot
        # - radius is the int pixel radius of the ball
        # - velocity is a list containing the x and y components
        # - surface is the window's pygame.Surface object
        self.color = pygame.Color(ball_color)
        self.radius = ball_radius
        self.center = ball_center
        self.velocity = ball_velocity
        self.surface = surface

    def move(self):
        # Change the location of the ball by adding the corresponding
        # speed values to the x and y coordinates of its center.
        # We need the height and width of the screen to determine when the
        # ball should change its direction of motion (bounce).
        screen_width = self.surface.get_width()
        screen_height = self.surface.get_height()
        # update the ball's position based on its velocity and bounce the
        # ball off a wall if it gets too close
        screen_size = (screen_width, screen_height)
        for index in range(0, len(self.center)):
            self.center[index] += self.velocity[index]
            if (self.center[index] <= 0 + self.radius or self.center[index] >=
                    screen_size[index] - self.radius):
                self.velocity[index] = -self.velocity[index]

    def draw(self):
        # Draw the ball on the surface
        pygame.draw.circle(self.surface, self.color, self.center, self.radius)


main()
The code calls draw() (which repaints and flushes the screen), and then calls draw_score_A() and draw_score_B() without flushing. So the changes made by the draw_score_ functions are never seen by the user.
If you have a draw() function, make it draw everything. This keeps all the drawing and flushing operations in a single place. Sure, the draw() function calls other functions to do the painting, but it all starts from one place.
    def draw(self):
        # Draw all game objects.
        # clear the display surface first
        self.surface.fill(self.bg_color)
        # draw ball on the surface
        self.ball.draw()
        # draw both paddles on the surface
        pygame.draw.rect(self.surface, (255, 255, 255), self.paddleA)
        pygame.draw.rect(self.surface, (255, 255, 255), self.paddleB)
        # Paint the scores
        self.draw_score_A()
        self.draw_score_B()
        # make the updated surface appear on the display
        pygame.display.update()
Also, your draw() function seems to be re-defining paddleA and paddleB as part of the drawing. I corrected this in the version above.
I am trying to implement Q-learning with an action-value approximation function. I am using openai-gym and the "MountainCar-v0" environment to test my algorithm. My problem is that it does not converge or find the goal at all.
Basically, the approximator works like this: you feed in the 2 features, position and velocity, plus one of the 3 actions in a one-hot encoding: 0 -> [1,0,0], 1 -> [0,1,0] and 2 -> [0,0,1]. The output is the action-value approximation Q_approx(s,a) for that specific action.
I know that usually the input is the state (2 features) and the output layer contains one output per action. The big difference I see is that I have to run the feed-forward pass 3 times (once per action) and take the max, while in the standard implementation you run it once and take the max over the outputs.
Maybe my implementation is just completely wrong and I am thinking about it wrong. I'll paste the code here; it is a mess, but I am just experimenting a bit:
import gym
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation

env = gym.make('MountainCar-v0')

# The mean reward over 20 episodes
mean_rewards = np.zeros(20)
# Feature numpy holder
features = np.zeros(5)
# Q_a value holder
qa_vals = np.zeros(3)

one_hot = {
    0: np.asarray([1, 0, 0]),
    1: np.asarray([0, 1, 0]),
    2: np.asarray([0, 0, 1])
}

model = Sequential()
model.add(Dense(20, activation="relu", input_dim=5))
model.add(Dense(10, activation="relu"))
model.add(Dense(1))

model.compile(optimizer='rmsprop',
              loss='mse',
              metrics=['accuracy'])

epsilon_greedy = 0.1
discount = 0.9
batch_size = 16

# Experience replay containing features and target
experience = np.ones((10 * 300, 5 + 1))
# Ring-buffer bookkeeping (these initializations were missing from the paste)
fill_index = 0
filled_once = False

# Ring buffer
def add_exp(features, target, index):
    if index % experience.shape[0] == 0:
        index = 0
        global filled_once
        filled_once = True
    experience[index, 0:5] = features
    experience[index, 5] = target
    index += 1
    return index
for e in range(0, 100000):
    obs = env.reset()
    old_obs = None
    new_obs = obs
    rewards = 0
    loss = 0
    for i in range(0, 300):
        if old_obs is not None:
            # Find max q_a for s_(t+1)
            features[0:2] = new_obs
            for i, pa in enumerate([0, 1, 2]):
                features[2:5] = one_hot[pa]
                qa_vals[i] = model.predict(features.reshape(-1, 5))
            rewards += reward
            target = reward + discount * np.max(qa_vals)
            features[0:2] = old_obs
            features[2:5] = one_hot[a]
            fill_index = add_exp(features, target, fill_index)
            # Find new action
            if np.random.random() < epsilon_greedy:
                a = env.action_space.sample()
            else:
                a = np.argmax(qa_vals)
        else:
            a = env.action_space.sample()
        obs, reward, done, info = env.step(a)
        old_obs = new_obs
        new_obs = obs
        if done:
            break
    if filled_once:
        samples_ids = np.random.choice(experience.shape[0], batch_size)
        loss += model.train_on_batch(experience[samples_ids, 0:5],
                                     experience[samples_ids, 5].reshape(-1))[0]
    mean_rewards[e % 20] = rewards
    print("e = {} and loss = {}".format(e, loss))
    if e % 50 == 0:
        print("e = {} and mean = {}".format(e, mean_rewards.mean()))
Thanks in advance!
There shouldn't be much difference between feeding the actions as inputs to your network and having them as different outputs of your network. It does make a huge difference if your states are images, for example, because conv nets work very well with images and there would be no obvious way of integrating the actions into the input.
Have you tried the CartPole balancing environment? It is better for testing whether your model is working correctly.
MountainCar is pretty hard. It has no reward until you reach the top, which often doesn't happen at all. The model will only start learning something useful once you get to the top once. If you are never getting to the top, you should probably increase the time you spend exploring; in other words, take more random actions. A lot more. One way to do that is sketched below.
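For example, a minimal annealed epsilon-greedy schedule (the starting value, decay factor, and floor below are arbitrary illustrations, not tuned for MountainCar):

# Start almost fully random, then decay epsilon toward a floor so the
# agent keeps exploring a little even late in training.
epsilon = 1.0          # initial exploration rate (assumed)
epsilon_min = 0.1      # floor (assumed)
epsilon_decay = 0.999  # per-episode multiplicative decay (assumed)

for e in range(100000):
    # ... run one episode, taking a random action with probability epsilon ...
    epsilon = max(epsilon_min, epsilon * epsilon_decay)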
How do you get the screen contents into an array in pygame? I have tried this from the documentation:
self.screen.lock()
tmp_frame = pygame.surfarray.array3d(self.screen)
self.screen.unlock()
I have tried all sorts of things, such as using pixelcopy to get a copy of the surface first, but I always get a segmentation fault.
Fatal Python error: (pygame parachute) Segmentation Fault
Aborted (core dumped)
Is it because I am trying to copy from the screen directly?
This is what screen contains:
self.screen = pygl2d.display.set_mode((self.SCREEN_WIDTH, self.SCREEN_HEIGHT), pygame.DOUBLEBUF, depth=24)
And this is the definition of set_mode:
def set_mode(resolution=(0, 0), flags=0, depth=0):
    flags |= pygame.OPENGL
    screen = pygame.display.set_mode(resolution, flags, depth)
    init_gl()
    return screen
Edit followup:
I also tried copying the screen surface into another surface first with
tmp_surface= self.screen.copy()
But I get
pygame.error: Cannot copy opengl display
So really, I suppose the question is how do you copy this opengl display contents into an array?
To anyone who might experience something similar:
I was not able to find a direct solution; all methods of accessing the hardware-accelerated surface resulted in a segmentation fault (array3d, array2d, accessing a reference array via pixels3d, etc.).
However, I was able to figure out a workaround. It appears that you are able to save images with
pygame.image.save(self.screen, 'output.png')
Similarly, you can do
string_image = pygame.image.tostring(self.screen, 'RGB')
temp_surf = pygame.image.fromstring(string_image,(self.SCREEN_WIDTH, self.SCREEN_HEIGHT),'RGB' )
tmp_arr = pygame.surfarray.array3d(temp_surf)
This should get you a numpy array of the screen contents.
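Wrapped up as a small helper (the function name and parameters are my own, just for convenience):

import pygame

def opengl_screen_to_array(screen, width, height):
    # surfarray cannot read an OpenGL display surface directly, so
    # round-trip through a string buffer and a plain software surface.
    string_image = pygame.image.tostring(screen, 'RGB')
    temp_surf = pygame.image.fromstring(string_image, (width, height), 'RGB')
    return pygame.surfarray.array3d(temp_surf)

You would call it once per frame, e.g. frame = opengl_screen_to_array(self.screen, self.SCREEN_WIDTH, self.SCREEN_HEIGHT).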
import pygame
import numpy as np
import time

pygame.init()
display = pygame.display.set_mode((350, 350))

# Build a simple greyscale gradient as a 2-D array and turn it into a surface
x = np.arange(0, 300)
y = np.arange(0, 300)
X, Y = np.meshgrid(x, y)
Z = X + Y
Z = 255 * Z / Z.max()

surf = pygame.surfarray.make_surface(Z)

running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
    display.blit(surf, (0, 0))
    pygame.display.update()
    # Convert the window contents into a 2-D matrix of pixel values
    window_pixel_matrix = pygame.surfarray.array2d(display)
    print(window_pixel_matrix)
    time.sleep(10)
pygame.quit()
Comment:
pygame.surfarray.array2d() is what you're looking for, I guess. Of course, you can use the pygame.surfarray.array3d() function as well.
You can refer to the official documentation: https://www.pygame.org/docs/ref/surfarray.html#pygame.surfarray.array2d
Does anyone know of any example code for the algorithm Ronald J. Williams proposed in
A class of gradient-estimating algorithms for reinforcement learning in neural networks?
Yes, do a search on GitHub, and you will get a whole bunch of results:
GitHub: WILLIAMS+REINFORCE
The most popular ones use this code (in Python):
__author__ = 'Thomas Rueckstiess, ruecksti#in.tum.de'
from pybrain.rl.learners.directsearch.policygradient import PolicyGradientLearner
from scipy import mean, ravel, array


class Reinforce(PolicyGradientLearner):
    """ Reinforce is a gradient estimator technique by Williams (see
    "Simple Statistical Gradient-Following Algorithms for
    Connectionist Reinforcement Learning"). It uses optimal
    baselines and calculates the gradient with the log likelihoods
    of the taken actions. """

    def calculateGradient(self):
        # normalize rewards
        # self.ds.data['reward'] /= max(ravel(abs(self.ds.data['reward'])))

        # initialize variables
        returns = self.dataset.getSumOverSequences('reward')
        seqidx = ravel(self.dataset['sequence_index'])

        # sum of sequences up to n-1
        loglhs = [sum(self.loglh['loglh'][seqidx[n]:seqidx[n + 1], :])
                  for n in range(self.dataset.getNumSequences() - 1)]
        # append sum of last sequence as well
        loglhs.append(sum(self.loglh['loglh'][seqidx[-1]:, :]))
        loglhs = array(loglhs)

        baselines = mean(loglhs ** 2 * returns, 0) / mean(loglhs ** 2, 0)
        # TODO: why gradient negative?
        gradient = -mean(loglhs * (returns - baselines), 0)

        return gradient
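For intuition, the quantity this code estimates is Williams' gradient update, roughly theta <- theta + alpha * (G - b) * grad of log pi(a|s; theta), where b is a baseline. A minimal standalone sketch for a linear-softmax policy over discrete actions (all names and the constant baseline are illustrative simplifications, not pybrain's API):

import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def reinforce_update(theta, episode, alpha=0.01, baseline=0.0):
    # theta: [n_features, n_actions] weights of a linear-softmax policy.
    # episode: list of (state_features, action, return_G) tuples.
    grad = np.zeros_like(theta)
    for s, a, G in episode:
        probs = softmax(s @ theta)
        one_hot = np.zeros(theta.shape[1])
        one_hot[a] = 1.0
        # grad of log pi(a|s) for a linear softmax policy: outer(s, e_a - probs)
        grad += np.outer(s, one_hot - probs) * (G - baseline)
    return theta + alpha * grad

The pybrain code above does the same thing in batch form, with a variance-minimizing baseline computed from the data instead of a constant.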