I am studying the documentation at https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html.
In the parameters section, it states
return_indices – if True, will return the max indices along with the outputs. Useful for torch.nn.MaxUnpool2d later
Could someone explain to me what max indices mean here? I believe it is the indices corresponding to the maximal value. If the maximal value is unique, does that mean only 1 index will be returned?
I assume you already know how max pooling works.
Then, let's print some results to get more insights.
import torch
import torch.nn as nn
pool = nn.MaxPool2d(kernel_size=2, return_indices=True)
input = torch.zeros(1, 1, 4, 4)
input[..., 0, 1] = input[..., 1, 3] = input[..., 2, 2] = input[..., 3, 0] = 1.
print(input)
Output:
tensor([[[[0., 1., 0., 0.],
          [0., 0., 0., 1.],
          [0., 0., 1., 0.],
          [1., 0., 0., 0.]]]])
output, indices = pool(input)
print(output)
Output:
tensor([[[[1., 1.],
          [1., 1.]]]])
print(indices)
Output:
tensor([[[[ 1,  7],
          [12, 10]]]])
If you flatten the input tensor into 1D, you can see that indices holds the flat position of each 1 value, i.e. the maximum within each MaxPool2d window. As stated in the documentation of torch.nn.MaxPool2d, indices is required by the torch.nn.MaxUnpool2d module:
MaxUnpool2d takes in as input the output of MaxPool2d including the indices of the maximal values and computes a partial inverse in which all non-maximal values are set to zero.
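As a quick check, and a minimal sketch of that use case (continuing the example above), indexing the flattened input with the returned indices recovers the window maxima, and feeding (output, indices) to MaxUnpool2d gives the partial inverse:
unpool = nn.MaxUnpool2d(kernel_size=2)
print(input.flatten()[indices.flatten()])  # tensor([1., 1., 1., 1.]) -- the window maxima
print(unpool(output, indices))             # same shape as input: maxima kept, all other values zero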
I am trying to use Hugging Face's AutoModelForSequenceClassification API for multi-class classification, but I am confused about its configuration.
My dataset is one-hot encoded and the problem type is multi-class (one label at a time).
What I have tried:
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=6,
    id2label=id2label,
    label2id=label2id)
batch_size = 8
metric_name = "f1"
from transformers import TrainingArguments, Trainer
args = TrainingArguments(
    f"bert-finetuned-english",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    num_train_epochs=10,
    weight_decay=0.01,
    load_best_model_at_end=True,
    metric_for_best_model=metric_name,
    # push_to_hub=True,
)
trainer = Trainer(
    model,
    args,
    train_dataset=encoded_dataset["train"],
    eval_dataset=encoded_dataset["test"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics
)
Is it correct?
I am confused about the loss function: when I print one forward pass, the loss's grad_fn is BinaryCrossEntropyWithLogitsBackward
SequenceClassifierOutput([('loss',
                           tensor(0.6986, grad_fn=<BinaryCrossEntropyWithLogitsBackward0>)),
                          ('logits',
                           tensor([[-0.5496,  0.0793, -0.5429, -0.1162, -0.0551]],
                                  grad_fn=<AddmmBackward0>))])
which is used for multi-label or binary classification tasks. Shouldn't it use nn.CrossEntropyLoss? How do I properly use this API for multi-class classification and define the loss function?
You have six classes, encoded with a value of 1 or 0 in each cell. For example, the tensor [0., 0., 0., 0., 1., 0.] represents the fifth class. The task is to predict six label scores (e.g. [1., 0., 0., 0., 0., 0.]) and compare them with the ground truth ([0., 0., 0., 0., 1., 0.]). Because the labels are given as one-hot float vectors, training uses the loss function behind BinaryCrossEntropyWithLogitsBackward, i.e. BCEWithLogitsLoss.
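If you want the model to use nn.CrossEntropyLoss instead, a minimal sketch (assuming a recent transformers version, where the loss is chosen from config.problem_type and the label dtype) is to pass integer class indices rather than one-hot vectors, optionally setting problem_type explicitly:
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=6,
    problem_type="single_label_classification",  # selects CrossEntropyLoss
)
input_ids = torch.randint(0, 30522, (1, 16))  # dummy token ids, for illustration only
labels = torch.tensor([4])                    # class index, not a one-hot vector
out = model(input_ids=input_ids, labels=labels)
print(out.loss.grad_fn)                       # expected: NllLossBackward0 (cross-entropy)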
I want to call the torch.nn.utils.spectral_norm function on a GCNConv layer
gc1 = GCNConv(18, 16)
spectral_norm(gc1)
but I'm getting the following error:
KeyError: 'weight'
meaning gc1._parameters doesn't have weight (only bias):
gc1._parameters
OrderedDict([('bias', Parameter containing:
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
requires_grad=True))])
However, gc1.parameters() stores two objects and one of them is a 16 by 18 matrix (weight matrix).
for p in gc1.parameters():
    print('P: ', p.shape)
P: torch.Size([16])
P: torch.Size([16, 18])
How can I make spectral_norm function work on a GCNConv module?
According to the source code, the weight parameter is wrapped in a linear module that GCNConv objects store as lin.
I imagine that this should then work:
gc1 = GCNConv(18, 16)
spectral_norm(gc1.lin)
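A slightly fuller sketch with imports, assuming torch_geometric is installed and that your GCNConv version keeps its weight on the lin submodule as in the source linked above:
import torch
from torch.nn.utils import spectral_norm
from torch_geometric.nn import GCNConv

gc1 = GCNConv(18, 16)
spectral_norm(gc1.lin)  # wraps the weight of the inner linear module in place
print('weight_orig' in dict(gc1.lin.named_parameters()))  # True if the hook was attached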
I am trying to implement transformers and stuck at one point.
Say I have input sequence of shape [2,20] where
2 is the number of sample and
20 is the number of words in sequence ( sequence length ).
So I create an array like [0,1,2, ... 19] of shape [1,20]. Now I want to stack it so that the final shape is [2,20], in line with the input sequence, like below:
[[0,1,2, ... 19],
[0,1,2, ... 19]]
Is there a torch function for doing this? I could loop to create the data and arrays, but I wanted to avoid that.
If the tensors you want to stack are of shape [1,20], you can use torch.cat()
t1 = torch.zeros([1,5])  # tensor([[0., 0., 0., 0., 0.]])
t2 = torch.ones([1,5])   # tensor([[1., 1., 1., 1., 1.]])
torch.cat([t1, t2])      # tensor([[0., 0., 0., 0., 0.],
                         #         [1., 1., 1., 1., 1.]])
If the tensors are 1-D, you can simply use torch.stack()
t1 = torch.zeros([5])    # tensor([0., 0., 0., 0., 0.])
t2 = torch.ones([5])     # tensor([1., 1., 1., 1., 1.])
torch.stack([t1, t2])    # tensor([[0., 0., 0., 0., 0.],
                         #         [1., 1., 1., 1., 1.]])
Now, for a shorter method for your case, you can do:
torch.arange(0,20).repeat(2,1)  # tensor([[0, 1, 2, ... 19],
                                #         [0, 1, 2, ... 19]])
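A small usage sketch (the variable names here are hypothetical): you can derive the positions directly from the input's shape, so the result always matches the batch size and sequence length:
x = torch.zeros(2, 20)                          # stand-in for your input sequence
batch_size, seq_len = x.shape
positions = torch.arange(seq_len).repeat(batch_size, 1)
print(positions.shape)                          # torch.Size([2, 20])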
I am doing multi-class semantic segmentation(4 classes + background). My mask dimension is (256, 256, 3) and the output mask dimension is (256, 256, 5). I took 5 because it is the number of classes.
Dice Loss Function
def dice_loss(inputs, targets, smooth=1.):
    inputs = inputs.view(-1)
    targets = targets.view(-1)
    intersection = (inputs * targets).sum()  # ---> error here
    dice = (2. * intersection + smooth) / (inputs.sum() + targets.sum() + smooth)
    return 1 - dice
What should I do to make the two dimensions the same? The mask was extracted from a TIF file.
I have attached my mask image below.
I'm assuming the target segmentation you are showing is an RGB-encoded map, and that you want to convert this 3-channel image into a 1-channel label map.
Assuming seg is your ground-truth segmentation map, shaped (b, 3, h, w), the label-to-color mapping can be arbitrarily set as:
colors = torch.FloatTensor([[0, 0, 0],
                            [1, 1, 0],
                            [1, 0, 0],
                            [0, 1, 0],
                            [0, 0, 1]])
For each color, construct a mask of matching pixels and assign the corresponding label in a new tensor at those pixel positions:
b, _, h, w = seg.shape
gt = torch.zeros(b, 1, h, w)
seg_perm = seg.permute(0, 2, 3, 1)
for label, color in enumerate(colors):
    mask = torch.all(seg_perm == color, dim=-1).unsqueeze(1)
    gt[mask] = label
For example taking the following segmentation map:
>>> seg = tensor([[[[1., 1., 0., 0.],
                    [1., 0., 0., 0.]],
                   [[0., 1., 0., 0.],
                    [0., 1., 0., 1.]],
                   [[0., 0., 0., 0.],
                    [0., 0., 1., 0.]]]])
For visualization purposes:
>>> T.ToPILImage()(seg[0].repeat_interleave(100,2).repeat_interleave(100,1))
And the resulting label map will be:
>>> gt
tensor([[[[2., 1., 0., 0.],
          [2., 3., 4., 3.]]]])
I believe you must first one-hot encode the target mask.
I suggest you first read this good article to get a better grasp of all the subtleties of semantic segmentation: https://www.jeremyjordan.me/semantic-segmentation/.
Make sure the prediction and target shapes match; there is no need to flatten the tensors with view(-1).
Also, as personal advice, prefer channel-first layouts for PyTorch tensors.
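A minimal sketch of the one-hot step, assuming gt is the (b, 1, h, w) label map built in the previous answer and that the network output has 5 channels (4 classes + background):
import torch.nn.functional as F

num_classes = 5
one_hot = F.one_hot(gt.squeeze(1).long(), num_classes=num_classes)  # (b, h, w, 5)
one_hot = one_hot.permute(0, 3, 1, 2).float()                       # (b, 5, h, w), channel-first
# one_hot now matches the network output shape and can be compared in the dice loss.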
This question already has an answer here:
How to add parameters in module class in pytorch custom model?
Recently I tried to use custom parameters in a conv2d convolutional layer; the current setup is:
kernel = [np.array([[0., 0., 0.], [0., -1., 0.], [0., 1., 0.]]),
          np.array([[0., 0., 0.], [0., -1., 1.], [0., 0., 0.]]),
          np.array([[0., 0., 0.], [0., -1., 0.], [0., 0., 1.]])]
self.weights = []
for kernel_filter in kernel:
    kernel_filter = torch.FloatTensor(kernel_filter).unsqueeze(0).unsqueeze(0)  # (1, 1, 3, 3)
    kernel_filter = np.repeat(kernel_filter, self.channels, axis=0)
    self.weights.append(nn.Parameter(data=kernel_filter, requires_grad=True))
...
for weight in self.weights:
    image_filtered = F.conv2d(x, weight, stride=[1, 1], padding=1, groups=self.channels)
After training several times, I checked the parameter values and found that they do not change. Why is this, or what is wrong with my understanding here?
Print out all of the model parameters and ensure that they are registered; I suspect they are not. See here: How to add parameters in module class in pytorch custom model?. In short, just because self.weights is an attribute of the model class doesn't mean the model recognizes it as a learnable parameter. To indicate this to the model, you call self.register_parameter.
(As such, this question contains the same info as the linked question and so should perhaps be closed, though I could see this being a useful avenue for people less familiar with the terminology to reach the info.)
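A minimal sketch (not the asker's exact module) of how registration makes the kernels visible to the optimizer; the name FilterBank is hypothetical:
import torch
import torch.nn as nn
import torch.nn.functional as F

class FilterBank(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.channels = channels
        kernel = torch.tensor([[0., 0., 0.], [0., -1., 0.], [0., 1., 0.]])
        weight = kernel.view(1, 1, 3, 3).repeat(channels, 1, 1, 1)  # (channels, 1, 3, 3)
        # Assigning an nn.Parameter as a module attribute (or using
        # self.register_parameter / nn.ParameterList) is what puts it in model.parameters().
        self.weight = nn.Parameter(weight)

    def forward(self, x):
        # Depthwise convolution: one fixed-initialized kernel per input channel.
        return F.conv2d(x, self.weight, stride=1, padding=1, groups=self.channels)

m = FilterBank(channels=3)
print([name for name, _ in m.named_parameters()])  # ['weight'] -- now seen by the optimizer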