pygame.sprite giving error RecursionError: maximum recursion depth exceeded while calling a Python object - pygame

My game is a simple platform game, but when I try to add sprites it gives me a RecursionError: maximum recursion depth exceeded while calling a Python object.
I followed a rough tutorial and tried to modify the code for myself. I'm quite new to coding, so can anyone help me solve this?
I've seen people turn the recursion into a loop, but I don't understand what that means.
class Enemy(pygame.sprite.Sprite()):
    def __init__(self, x, y):
        pygame.sprite.Sprite.__init__(self)
        self.image = pygame.image.load('image/spida.png')
        self.rect = self.image.get_rect()
        self.rect.x = x
        self.rect.y = y
Here's the chunk of code that gives me the error. The specific part is pygame.sprite.Sprite.__init__(self); the editor highlights the self argument, saying Expected type 'Sprite', got 'Enemy' instead (this shows up in the project errors panel).

There is a typo in your code: pygame.sprite.Sprite() creates a new instance object instead of referring to the class. Change:
class Enemy(pygame.sprite.Sprite()):
to
class Enemy(pygame.sprite.Sprite):
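For reference, a minimal sketch of the corrected class with the fix applied (keeping the attributes and image path from the question), plus an example of adding it to a sprite group:

import pygame

class Enemy(pygame.sprite.Sprite):   # inherit from the class, not from an instance
    def __init__(self, x, y):
        pygame.sprite.Sprite.__init__(self)
        self.image = pygame.image.load('image/spida.png')
        self.rect = self.image.get_rect()
        self.rect.x = x
        self.rect.y = y

enemy = Enemy(100, 200)              # example coordinates
enemy_group = pygame.sprite.Group(enemy)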

Related

Higher loss while training a matrix factorization problem using larger batch size

First, I want to thank anyone who takes the time to read this question, and I apologize in advance if the question is naive, and for my poor English.
Currently I'm working on a recommendation system problem, and my approach is matrix factorization with implicit feedback using BPR (arXiv:1205.2618). I discovered that when I train my model (BPRMF) with a large batch size (in this case 4096), I get a poorer BPR loss than with a smaller batch size (1024); see my training log over a few epochs.
I noticed that a larger batch size gives faster training, since it uses GPU memory more efficiently, but a higher loss is not something I'm willing to trade for that. As far as I know, a large batch size gives the gradient descent step more information to take a better step, so it should help convergence; the usual problem with large batch sizes is memory and resources, not the loss.
I have done some research on this and found Large Batch Training Result in Poor Generalization and another similar discussion, but in my case the loss is poor during training, not at test time.
My best guess is that with a large batch size, taking the mean of the loss scales the gradient flowing into the user and item embeddings by a 1 / batch_size coefficient, making it hard to escape bad local minima during training. Is that the explanation here? (However, I have seen recent work suggesting that local minima are not necessarily bad, so...)
I would really appreciate anyone helping me understand why a large batch size ends up with these anomalous results.
Side note: this might be another silly question, but as you can see in the code below, the l2 loss is not normalized by batch size, so I expected it to at least double or quadruple when I multiply the batch size by 4, but that does not seem to be the case in the log above.
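To illustrate the scaling effect behind my guess, here is a small standalone sketch (not my actual model, just toy tensors) comparing embedding gradients under mean versus sum reduction:

import torch
import torch.nn.functional as F

torch.manual_seed(0)
emb = torch.randn(8, 4, requires_grad=True)   # toy embedding table
idx = torch.tensor([0, 1, 2, 3])              # a toy "batch" of 4 rows

# Mean reduction: each example's gradient is divided by the batch size
loss_mean = F.embedding(idx, emb).pow(2).sum(dim=1).mean()
loss_mean.backward()
grad_mean = emb.grad.clone()
emb.grad.zero_()

# Sum reduction: per-example gradients are not scaled down
loss_sum = F.embedding(idx, emb).pow(2).sum(dim=1).sum()
loss_sum.backward()
grad_sum = emb.grad.clone()

print(torch.allclose(grad_sum, grad_mean * 4))   # True: mean scales gradients by 1/batch_size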
Here is my code
from typing import Tuple

import torch
from torch.nn.parameter import Parameter
import torch.nn.functional as F

from .PretrainedModel import PretrainedModel


class BPRMFModel(PretrainedModel):
    def __init__(self, n_users: int, n_items: int, u_embed: int, l2: float,
                 dataset: str, u_i_pretrained_dir, use_pretrained=0, **kwargs) -> None:
        super().__init__(n_users=n_users, n_items=n_items, u_embed=u_embed, dataset=dataset,
                         u_i_pretrained_dir=u_i_pretrained_dir, use_pretrained=use_pretrained,
                         **kwargs)
        self.l2 = l2
        self.reset_parameters()
        self.items_e = Parameter(self._items_e)
        self.users_e = Parameter(self._users_e)

    def forward(self, u: torch.Tensor, i: torch.Tensor) -> torch.Tensor:
        u = F.embedding(u, self.users_e)
        i = F.embedding(i, self.items_e)
        return torch.matmul(u, i.T)

    def CF_loss(self, u: torch.Tensor, i_pos: torch.Tensor, i_neg: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
        # u, i_pos, i_neg shape is [batch_size,]
        u = F.embedding(u, self.users_e)
        i_pos = F.embedding(i_pos, self.items_e)
        i_neg = F.embedding(i_neg, self.items_e)
        pos_scores = torch.einsum("ij,ij->i", u, i_pos)
        neg_scores = torch.einsum("ij,ij->i", u, i_neg)
        # loss = torch.mean(
        #     F.softplus(-(pos_scores - neg_scores))
        # )
        loss = torch.neg(
            torch.mean(
                F.logsigmoid(pos_scores - neg_scores)
            )
        )
        l2_loss = (
            u.pow(2).sum() +
            i_pos.pow(2).sum() +
            i_neg.pow(2).sum()
        )
        return loss, self.l2 * l2_loss

    def get_users_rating_for_each_items(self, u: torch.Tensor, i: torch.Tensor) -> torch.Tensor:
        return self(u, i)

    def save_pretrained(self):
        self._items_e = self.items_e.data
        self._users_e = self.users_e.data
        return super().save_pretrained()
PretrainedModel is just a base class that helps me save and load model weights.
I really appreciate anyone who bears with me to the end.

Why doesn't Detectron2 detect any other instances after transfer learning?

I have followed the tutorial from this website https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5#scrollTo=U5LhISJqWXgM.
The question is: the images contained 'person' and many other instances, so why weren't they detected and segmented?
from detectron2.utils.visualizer import ColorMode

dataset_dicts = get_balloon_dicts("balloon/val")
for d in random.sample(dataset_dicts, 3):
    im = cv2.imread(<CUSTOM_IMAGE_CONTAINING_PERSON_DETECTED_WHEN_CUSTOM_WASNT_USED>)  # <-- changed
    outputs = predictor(im)
    v = Visualizer(im[:, :, ::-1],
                   metadata=balloon_metadata,
                   scale=0.8,
                   instance_mode=ColorMode.IMAGE_BW  # remove the colors of unsegmented pixels
    )
    v = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(v.get_image()[:, :, ::-1])
After downloading a custom photo containing people, I ran inference on the image as instructed under "Run a pre-trained detectron2 model", using the model they used (model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")). But when I use the model trained on the balloons, it does not detect the people in the image (the threshold is not the problem, since I used 0.5 in both cases). Why is this, and how can I make it show all the instances? Help would be greatly appreciated :D
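For anyone trying to reproduce the comparison, here is a rough sketch that runs the COCO-pretrained predictor and the fine-tuned balloon predictor on the same image; the image filename and the fine-tuned weight path are assumptions based on the tutorial's defaults:

from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
import cv2

im = cv2.imread("people_and_balloons.jpg")  # hypothetical test image

# COCO-pretrained predictor: trained on all 80 COCO classes, including 'person'
coco_cfg = get_cfg()
coco_cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
coco_cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
coco_cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
coco_outputs = DefaultPredictor(coco_cfg)(im)

# Fine-tuned predictor: in the tutorial its ROI head is replaced and trained with
# NUM_CLASSES = 1, so it can only ever predict the 'balloon' class
balloon_cfg = get_cfg()
balloon_cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
balloon_cfg.MODEL.WEIGHTS = "output/model_final.pth"   # assumed path from the tutorial
balloon_cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1
balloon_cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
balloon_outputs = DefaultPredictor(balloon_cfg)(im)

print(coco_outputs["instances"].pred_classes)     # many COCO class ids, e.g. 0 = person
print(balloon_outputs["instances"].pred_classes)  # only class 0, i.e. balloon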

Proper way to extract embedding weights for CBOW model?

I'm currently trying to implement the CBOW model. I managed to get the training and testing working, but am facing some confusion as to the "proper" way to finally extract the weights from the model to use as our word embeddings.
Model
class CBOW(nn.Module):
    def __init__(self, config, vocab):
        super().__init__()  # required for nn.Module subclasses
        self.config = config  # Basic config file to hold arguments.
        self.vocab = vocab
        self.vocab_size = len(self.vocab.token2idx)
        self.window_size = self.config.window_size
        self.embed = nn.Embedding(num_embeddings=self.vocab_size, embedding_dim=self.config.embed_dim)
        self.linear = nn.Linear(in_features=self.config.embed_dim, out_features=self.vocab_size)

    def forward(self, x):
        x = self.embed(x)
        x = torch.mean(x, dim=0)  # Average out the embedding values.
        x = self.linear(x)
        return x
Main process
After I run my model through a Solver with the training and testing data, I basically told the train and test functions to also return the model that's used. Then I assigned the embedding weights to a separate variable and used those as the word embeddings.
Training and testing was conducted using cross entropy loss, and each training and testing sample is of the form ([context words], target word).
def run(solver, config, vocabulary):
    for epoch in range(config.num_epochs):
        loss_train, model_train = solver.train()
        loss_test, model_test = solver.test()
    embeddings = model_train.embed.weight
I'm not sure if this is the correct way of going about extracting and using the embeddings. Is there usually another way to do this? Thanks in advance.
Yes, model_train.embed.weight will give you a torch tensor that stores the embedding weights. Note, however, that this tensor also carries the latest gradients. If you don't want/need them, model_train.embed.weight.data will give you the weights only.
A more generic option is to call model_train.embed.parameters(). This will give you a generator over all the weight tensors of the layer. In general, a layer has multiple weight tensors and weight will give you only one of them. Embedding happens to have only one, so here it doesn't matter which option you use.
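As a small illustration of the difference (variable names taken from the question's code):

emb = model_train.embed

w = emb.weight                                # nn.Parameter, still attached to the autograd graph
w_data = emb.weight.data                      # the underlying tensor, no gradient bookkeeping
w_numpy = emb.weight.detach().cpu().numpy()   # a common way to export embeddings for downstream use

params = list(emb.parameters())               # all weight tensors of the layer; nn.Embedding has exactly one
assert params[0] is w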

embedding layer outputs nan

I am trying to train a seq2seq model.
An embedding layer is located in the encoder, and it sometimes outputs nan values after some iterations.
I cannot identify the reason.
How can I solve this?
The problem is the first emb_layer in the forward function in the code below.
class TransformerEncoder(nn.Module):
    def __init__(self, vocab_size, hidden_size=1024, num_layers=6, dropout=0.2,
                 input_pad=1, batch_first=False, embedder=None, init_weight=0.1):
        super(TransformerEncoder, self).__init__()
        self.input_pad = input_pad
        self.vocab_size = vocab_size
        self.num_layers = num_layers
        self.embedder = embedder
        if embedder is not None:
            self.emb_layer = embedder
        else:
            self.emb_layer = nn.Embedding(vocab_size, hidden_size, padding_idx=1)
        self.positional_encoder = PositionalEncoder()
        self.transformer_layers = nn.ModuleList()
        for _ in range(num_layers):
            self.transformer_layers.append(
                TransformerEncoderBlock(num_heads=8, embedding_dim=1024, dropout=dropout))

    def set_mask(self, inputs):
        self.input_mask = (inputs == self.input_pad).unsqueeze(1)

    def forward(self, inputs):
        x = self.emb_layer(inputs)
        x = self.positional_encoder(x)
It is usually the inputs, more than the weights, that tend to become nan (they go either too high or too low). Maybe they are incorrect to start with and get worse after a few gradient updates. You can identify such inputs by running the tensor or np.array through a simple condition check like:
print("Inp value too high") if len(bert_embeddings[bert_embeddings > 1000]) > 1 else None
A common beginner mistake is to use torch.empty instead of torch.zeros. This invariably leads to nan over time.
If all your inputs are good, then it is the vanishing or exploding gradients issue. Check whether the problem worsens after a few iterations. Explore different activations or clip gradients, which usually fixes this type of issue. If you are using recent optimizers you usually do not need to worry about adjusting the learning rate.
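As a concrete sketch of those two checks (the dataloader, model, and optimizer names are placeholders, not from the question's code):

import torch

for inputs, targets in dataloader:               # placeholder training loop
    # 1. Sanity-check the batch before it reaches the embedding layer
    if not torch.isfinite(inputs).all():
        raise ValueError("non-finite values in the input batch")

    loss = model(inputs, targets)                # placeholder forward pass + loss
    optimizer.zero_grad()
    loss.backward()

    # 2. Clip exploding gradients before the optimizer step
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()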
It looks like some weights become nan. One possible reason is that on some iteration a layer outputs +-inf. If the output is +-inf on the forward pass, the backward pass will also produce +-inf, and since inf - inf = nan, the weights become nan, and every following iteration will output nan.
You can check this by tracking inf outputs in emb_layer.
If this is the reason, try to avoid functions that may return inf values.
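One way to do that tracking, for example, is a forward hook on the embedding layer (a sketch, assuming encoder is the TransformerEncoder instance from the question):

import torch

def check_finite(module, inputs, output):
    # Runs after every forward pass of the hooked module
    if torch.isinf(output).any() or torch.isnan(output).any():
        raise RuntimeError(f"non-finite values coming out of {module.__class__.__name__}")

hook_handle = encoder.emb_layer.register_forward_hook(check_finite)
# ... train as usual; remove the hook later with hook_handle.remove()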
My dataset was very small compared to the one in the tutorial I was following, and my embeddings were way too big for the data available, so eventually the NaN propagated through the network. Making my embedding layers smaller (a smaller number of factors / columns in the matrix) solved the NaN problem for me.

How to implement low-dimensional embedding layer in pytorch

I recently read a paper about embeddings.
In Eq. (3), f is a 4096x1 vector. The author tries to compress the vector into theta (a 20x1 vector) by using an embedding matrix E.
The equation is simply theta = E * f.
I was wondering if PyTorch can be used to achieve this, so that during training E can be learned automatically.
How do I finish the rest? Thanks so much.
The demo code follows:
import torch
from torch import nn
f = torch.randn(4096,1)
Embedding layers are used when your input vectors are one-hot; in that case you can directly use the embedding layer from torch, which does the above as well as some more things. nn.Embedding takes the non-zero index of the one-hot vector as input, as a long tensor. For example, if the feature vectors are
f = [[0,0,1], [1,0,0]]
then the input to nn.Embedding will be
input = [2, 0]
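A tiny sketch of that usage (the dimensions here are picked arbitrarily):

import torch
from torch import nn

emb = nn.Embedding(num_embeddings=3, embedding_dim=20)
idx = torch.tensor([2, 0])   # the non-zero positions of the one-hot rows above
theta = emb(idx)             # shape (2, 20): one 20-dim embedding per input index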
However, what the OP asked about is getting embeddings by matrix multiplication, and I will address that below. You can define a module to do it as follows. Since param is an instance of nn.Parameter, it will be registered as a parameter and will be optimized when you call Adam or any other optimizer.
class Embedding(nn.Module):
    def __init__(self, input_dim, embedding_dim):
        super().__init__()
        self.param = torch.nn.Parameter(torch.randn(input_dim, embedding_dim))

    def forward(self, x):
        return torch.mm(x, self.param)
If you look carefully, this is the same as a linear layer with no bias and a slightly different initialization. Therefore, you can achieve the same thing with a linear layer, as below.
self.embedding = nn.Linear(4096, 20, bias=False)
# change the initial weights to Normal(0, 1) or whatever is required
self.embedding.weight.data = torch.randn_like(self.embedding.weight)
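A rough end-to-end sketch tying this back to the original demo code (note that nn.Linear expects the feature dimension last, so the 4096x1 column vector f is transposed to a 1x4096 row):

import torch
from torch import nn

f = torch.randn(4096, 1)                      # the 4096x1 vector from the question

embedding = nn.Linear(4096, 20, bias=False)   # the weight is the 20x4096 matrix E
embedding.weight.data = torch.randn_like(embedding.weight)

theta = embedding(f.T)                        # shape (1, 20), i.e. theta = E * f
print(theta.shape)

# embedding.weight is a Parameter, so an optimizer will learn E automatically
optimizer = torch.optim.Adam(embedding.parameters(), lr=1e-3)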