I am studying the documentation at https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html.
In the parameters section, it states
return_indices – if True, will return the max indices along with the outputs. Useful for torch.nn.MaxUnpool2d later
Could someone explain to me what max indices mean here? I believe it is the indices corresponding to the maximal value. If the maximal value is unique, does that mean only 1 index will be returned?
I assume you already know how max pooling works.
Then, let's print some results to get more insights.
import torch
import torch.nn as nn
pool = nn.MaxPool2d(kernel_size=2, return_indices=True)
input = torch.zeros(1, 1, 4, 4)
input[..., 0, 1] = input[..., 1, 3] = input[..., 2, 2] = input[..., 3, 0] = 1.
print(input)
Output:
tensor([[[[0., 1., 0., 0.],
          [0., 0., 0., 1.],
          [0., 0., 1., 0.],
          [1., 0., 0., 0.]]]])
output, indices = pool(input)
print(output)
Output:
tensor([[[[1., 1.],
          [1., 1.]]]])
print(indices)
Output:
tensor([[[[ 1,  7],
          [12, 10]]]])
If you flatten the input tensor into 1D, you can see that indices holds the flat position of each 1 value, i.e. the maximum within each MaxPool2d window. As stated in the documentation of torch.nn.MaxPool2d, indices is required by the torch.nn.MaxUnpool2d module:
MaxUnpool2d takes in as input the output of MaxPool2d including the indices of the maximal values and computes a partial inverse in which all non-maximal values are set to zero.
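As a quick check, and a minimal sketch of that use case (continuing the example above), indexing the flattened input with the returned indices recovers the window maxima, and feeding (output, indices) to MaxUnpool2d gives the partial inverse:
unpool = nn.MaxUnpool2d(kernel_size=2)
print(input.flatten()[indices.flatten()])  # tensor([1., 1., 1., 1.]) -- the window maxima
print(unpool(output, indices))             # same shape as input: maxima kept, all other values zero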
I am trying to use Hugging Face's AutoModelForSequenceClassification API for multi-class classification, but I am confused about its configuration.
My dataset is one-hot encoded and the problem type is multi-class (one label at a time).
What I have tried:
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=6,
    id2label=id2label,
    label2id=label2id)
batch_size = 8
metric_name = "f1"
from transformers import TrainingArguments, Trainer
args = TrainingArguments(
    f"bert-finetuned-english",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    num_train_epochs=10,
    weight_decay=0.01,
    load_best_model_at_end=True,
    metric_for_best_model=metric_name,
    # push_to_hub=True,
)
trainer = Trainer(
    model,
    args,
    train_dataset=encoded_dataset["train"],
    eval_dataset=encoded_dataset["test"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics
)
Is it correct?
I am confused about the loss function: when I print one forward pass, the loss's grad_fn is BinaryCrossEntropyWithLogitsBackward
SequenceClassifierOutput([('loss',
                           tensor(0.6986, grad_fn=<BinaryCrossEntropyWithLogitsBackward0>)),
                          ('logits',
                           tensor([[-0.5496,  0.0793, -0.5429, -0.1162, -0.0551]],
                                  grad_fn=<AddmmBackward0>))])
which is used for multi-label or binary classification tasks. Shouldn't it use nn.CrossEntropyLoss? How do I properly use this API for multi-class classification and define the loss function?
You have six classes, encoded with a value of 1 or 0 in each cell. For example, the tensor [0., 0., 0., 0., 1., 0.] represents the fifth class. The task is to predict six label scores (e.g. [1., 0., 0., 0., 0., 0.]) and compare them with the ground truth ([0., 0., 0., 0., 1., 0.]). Because the labels are given as one-hot float vectors, training uses the loss function behind BinaryCrossEntropyWithLogitsBackward, i.e. BCEWithLogitsLoss.
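If you want the model to use nn.CrossEntropyLoss instead, a minimal sketch (assuming a recent transformers version, where the loss is chosen from config.problem_type and the label dtype) is to pass integer class indices rather than one-hot vectors, optionally setting problem_type explicitly:
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=6,
    problem_type="single_label_classification",  # selects CrossEntropyLoss
)
input_ids = torch.randint(0, 30522, (1, 16))  # dummy token ids, for illustration only
labels = torch.tensor([4])                    # class index, not a one-hot vector
out = model(input_ids=input_ids, labels=labels)
print(out.loss.grad_fn)                       # expected: NllLossBackward0 (cross-entropy)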
I want to call the torch.nn.utils.spectral_norm function on a GCNConv layer
gc1 = GCNConv(18, 16)
spectral_norm(gc1)
but I'm getting the following error:
KeyError: 'weight'
meaning gc1._parameters doesn't have weight (only bias):
gc1._parameters
OrderedDict([('bias', Parameter containing:
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
requires_grad=True))])
However, gc1.parameters() stores two objects and one of them is a 16 by 18 matrix (weight matrix).
for p in gc1.parameters():
    print('P: ', p.shape)
P: torch.Size([16])
P: torch.Size([16, 18])
How can I make spectral_norm function work on a GCNConv module?
According to the source code, the weight parameter is wrapped in a linear module that GCNConv objects store as lin.
I imagine that this should then work:
gc1 = GCNConv(18, 16)
spectral_norm(gc1.lin)
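A slightly fuller sketch with imports, assuming torch_geometric is installed and that your GCNConv version keeps its weight on the lin submodule as in the source linked above:
import torch
from torch.nn.utils import spectral_norm
from torch_geometric.nn import GCNConv

gc1 = GCNConv(18, 16)
spectral_norm(gc1.lin)  # wraps the weight of the inner linear module in place
print('weight_orig' in dict(gc1.lin.named_parameters()))  # True if the hook was attached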
I am trying to implement transformers and stuck at one point.
Say I have input sequence of shape [2,20] where
2 is the number of sample and
20 is the number of words in sequence ( sequence length ).
So I create an array like [0,1,2, ... 19] of shape [1,20]. Now I want to stack it so that the final shape is [2,20], in line with the input sequence, like below:
[[0,1,2, ... 19],
[0,1,2, ... 19]]
Is there a torch function for doing this? I could loop to create the data and arrays, but I wanted to avoid that.
If the tensors you want to stack are of shape [1,20], you can use torch.cat()
t1 = torch.zeros([1,5])  # tensor([[0., 0., 0., 0., 0.]])
t2 = torch.ones([1,5])   # tensor([[1., 1., 1., 1., 1.]])
torch.cat([t1, t2])      # tensor([[0., 0., 0., 0., 0.],
                         #         [1., 1., 1., 1., 1.]])
If the tensors are 1-D, you can simply use torch.stack()
t1 = torch.zeros([5])    # tensor([0., 0., 0., 0., 0.])
t2 = torch.ones([5])     # tensor([1., 1., 1., 1., 1.])
torch.stack([t1, t2])    # tensor([[0., 0., 0., 0., 0.],
                         #         [1., 1., 1., 1., 1.]])
Now, for a shorter method for your case, you can do:
torch.arange(0,20).repeat(2,1)  # tensor([[0, 1, 2, ... 19],
                                #         [0, 1, 2, ... 19]])
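A small usage sketch (the variable names here are hypothetical): you can derive the positions directly from the input's shape, so the result always matches the batch size and sequence length:
x = torch.zeros(2, 20)                          # stand-in for your input sequence
batch_size, seq_len = x.shape
positions = torch.arange(seq_len).repeat(batch_size, 1)
print(positions.shape)                          # torch.Size([2, 20])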
I am doing multi-class semantic segmentation(4 classes + background). My mask dimension is (256, 256, 3) and the output mask dimension is (256, 256, 5). I took 5 because it is the number of classes.
Dice Loss Function
def dice_loss(inputs, targets, smooth=1.):
    inputs = inputs.view(-1)
    targets = targets.view(-1)
    intersection = (inputs * targets).sum()  # ---> error here
    dice = (2. * intersection + smooth) / (inputs.sum() + targets.sum() + smooth)
    return 1 - dice
What should I do to make the two dimensions the same? The mask was extracted from a TIF file.
I have attached my mask image below.
I'm assuming the target segmentation you are showing is an RGB-encoded map, and that you want to convert this 3-channel image into a 1-channel label map.
Assuming seg is your ground-truth segmentation map, shaped (b, 3, h, w), the label-to-color mapping can be arbitrarily set as:
colors = torch.FloatTensor([[0, 0, 0],
                            [1, 1, 0],
                            [1, 0, 0],
                            [0, 1, 0],
                            [0, 0, 1]])
For each color, construct a mask of matching pixels and assign the corresponding label in a new tensor at those pixel positions:
b, _, h, w = seg.shape
gt = torch.zeros(b, 1, h, w)
seg_perm = seg.permute(0, 2, 3, 1)
for label, color in enumerate(colors):
    mask = torch.all(seg_perm == color, dim=-1).unsqueeze(1)
    gt[mask] = label
For example taking the following segmentation map:
>>> seg = tensor([[[[1., 1., 0., 0.],
                    [1., 0., 0., 0.]],
                   [[0., 1., 0., 0.],
                    [0., 1., 0., 1.]],
                   [[0., 0., 0., 0.],
                    [0., 0., 1., 0.]]]])
For visualization purposes:
>>> T.ToPILImage()(seg[0].repeat_interleave(100,2).repeat_interleave(100,1))
And the resulting label map will be:
>>> gt
tensor([[[[2., 1., 0., 0.],
          [2., 3., 4., 3.]]]])
I believe you must first one-hot encode the target mask.
I suggest you first read this good article to get a better grasp of all the subtleties of semantic segmentation: https://www.jeremyjordan.me/semantic-segmentation/.
Make sure the prediction and target shapes match; there is no need to flatten the tensors with view(-1).
Also, as personal advice, prefer channel-first layouts for PyTorch tensors.
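A minimal sketch of the one-hot step, assuming gt is the (b, 1, h, w) label map built in the previous answer and that the network output has 5 channels (4 classes + background):
import torch.nn.functional as F

num_classes = 5
one_hot = F.one_hot(gt.squeeze(1).long(), num_classes=num_classes)  # (b, h, w, 5)
one_hot = one_hot.permute(0, 3, 1, 2).float()                       # (b, 5, h, w), channel-first
# one_hot now matches the network output shape and can be compared in the dice loss.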
This question already has an answer here:
How to add parameters in module class in pytorch custom model?
Recently I tried to use custom parameters in a conv2d convolutional layer; the current setup is:
kernel = [np.array([[0., 0., 0.], [0., -1., 0.], [0., 1., 0.]]),
          np.array([[0., 0., 0.], [0., -1., 1.], [0., 0., 0.]]),
          np.array([[0., 0., 0.], [0., -1., 0.], [0., 0., 1.]])]
self.weights = []
for kernel_filter in kernel:
    kernel_filter = torch.FloatTensor(kernel_filter).unsqueeze(0).unsqueeze(0)  # (1, 1, 3, 3)
    kernel_filter = np.repeat(kernel_filter, self.channels, axis=0)
    self.weights.append(nn.Parameter(data=kernel_filter, requires_grad=True))
...
for weight in self.weights:
    image_filtered = F.conv2d(x, weight, stride=[1, 1], padding=1, groups=self.channels)
After training several times, I checked the parameter values and found that they do not change. Why is this, or what is wrong with my understanding here?
Print out all of the model parameters and ensure that they are registered; I suspect they are not. See here: How to add parameters in module class in pytorch custom model?. In short, just because self.weights is an attribute of the model class doesn't mean the model recognizes it as a learnable parameter. To indicate this to the model, you call self.register_parameter.
(As such, this question contains the same info as the linked question and so should perhaps be closed, though I could see this being a useful avenue for people less familiar with the terminology to reach the info.)
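A minimal sketch (not the asker's exact module) of how registration makes the kernels visible to the optimizer; the name FilterBank is hypothetical:
import torch
import torch.nn as nn
import torch.nn.functional as F

class FilterBank(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.channels = channels
        kernel = torch.tensor([[0., 0., 0.], [0., -1., 0.], [0., 1., 0.]])
        weight = kernel.view(1, 1, 3, 3).repeat(channels, 1, 1, 1)  # (channels, 1, 3, 3)
        # Assigning an nn.Parameter as a module attribute (or using
        # self.register_parameter / nn.ParameterList) is what puts it in model.parameters().
        self.weight = nn.Parameter(weight)

    def forward(self, x):
        # Depthwise convolution: one fixed-initialized kernel per input channel.
        return F.conv2d(x, self.weight, stride=1, padding=1, groups=self.channels)

m = FilterBank(channels=3)
print([name for name, _ in m.named_parameters()])  # ['weight'] -- now seen by the optimizer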