I would like to compute, in the most efficient way, the following operation:
cc = nn.Conv1d(1,1,3,1,1,bias=False)
U = torch.randn(1,1,10,10)
V = torch.zeros_like(U)
for i in range(10):
V[:,:,:,i] = cc(U[:,:,:,i])
In other words, I would like to apply a 1d convolution on the columns of the input U. Of course the for loop is too slow and I am sure there is a more efficient way to solve the problem. However, I can not find any idea to achieve that.
You can apply a 2D convolution with a rectangular shape window ie. 1x3:
conv = nn.Conv2d(1,1, kernel_size=(1,3), padding=1, bias=False)
Related
I am beginner in deep learning.
I am using this dataset and I want my network to detect keypoints of a hand.
How can I make my output layer's nodes to be in range [-1, 1] (range of normalized 2D points)?
Another problem is when I train for more than 1 epoch the loss gets negative values
criterion: torch.nn.MultiLabelSoftMarginLoss() and optimizer: torch.optim.SGD()
Here u can find my repo
net = nnModel.Net()
net = net.to(device)
criterion = nn.MultiLabelSoftMarginLoss()
optimizer = optim.SGD(net.parameters(), lr=learning_rate)
lr_scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer=optimizer, gamma=decay_rate)
You can use the Tanh activation function, since the image of the function lies in [-1, 1].
The problem of predicting key-points in an image is more of a regression problem than a classification problem (especially if you're making your model outputs + targets fall within a continuous interval). Therefore, I suggest you use the L2 Loss.
In fact, it could be a good exercise for you to determine which loss function that is appropriate for regression problems provides the lowest expected generalization error using cross-validation. There's several such functions available in PyTorch.
One way I can think of is to use torch.nn.Sigmoid which produces outputs in [0,1] range and scale outputs to [-1,1] using 2*x-1 transformation.
I'm trying to achieve an algorithm to efficiently calculate a combination of convolution and correlation such as following :
c(x,y)=(sum of i, (sum of j, a(x-i,y+j)*b(i,j)))
I have known that 1-D convolution or correlation can be solved by
a conv b = ifft(fft(a).*fft(b))
a corr b = ifft(fft(a).*conjg(fft(b)))
But I have no idea about the combination of them in 2-D or N-D problems. I think it is similar to 2-D convolution, but I don't know the specific deduction process.
The correlation can be written in terms of the convolution by reversing one of the arguments:
corr(x(t),y(t)) = conv(x(t),y(-t))
Thus, if you want the x-axis to behave like a convolution and the y-axis to behave like a correlation, reverse the y-axis only and compute the convolution. It doesn’t matter if you use a spatial or frequency domain implementation.
In my Neural network model, I represent an 8 word-sentence with a 8x256 dimensional embedding matrix. I want to give it to a LSTM as a input where LSTM takes a single word embedding at a time as input and process it. According to pytorch documentation, the input should be in the shape of (seq_len, batch, input_size). What is the correct way to convert my input to desired shape ? I don't want to mixup the numbers by mistake. I am quite new in PyTorch and row-major calculations, therefore I wanted to ask it here. I do it as follows, is it correct ?
x = torch.rand(8,256)
lstm_input = torch.reshape(x,(8,1,256))
Your solution is correct: you added a Singleton dimension for the "batch" dimension, leaving x to be with temporal dimension 8 and input dimension 256.
Since you are new to pytorch, here are a few equivalent ways of doing the same thing:
x = x[:, None, :]
Putting None in the dim=1 indicates to pytorch to add a singelton dimension.
Another way is to use view:
x = x.view(8, 1, 256)
I am finetuning resnet on my dataset which has multiple labels.
I would like to convert the 'scores' of the classification layer to probabilities and use those probabilities to calculate the loss at the training.
Could you give an example code for this?
Can I use like this:
P = net.forward(x)
p = torch.nn.functional.softmax(P, dim=1)
loss = torch.nn.functional.cross_entropy(P, y)
I am unclear whether this is the correct way or not as I am passing probabilities as the input to crossentropy loss.
So, you are training a model i.e resnet with cross-entropy in pytorch. Your loss calculation would look like this.
logit = model(x)
loss = torch.nn.functional.cross_entropy(logits=logit, target=y)
In this case, you can calculate the probabilities of all classes by doing,
logit = model(x)
p = torch.nn.functional.softmax(logit, dim=1)
# to calculate loss using probabilities you can do below
loss = torch.nn.functional.nll_loss(torch.log(p), y)
Note that if you use probabilities you will have to manually take a log, which is bad for numerical reasons. Instead, either use log_softmax or cross_entropy in which case you may end up computing losses using cross entropy and computing probability separately.
I am trying to implement discriminant condition codes in Keras as proposed in
Xue, Shaofei, et al., "Fast adaptation of deep neural network based
on discriminant codes for speech recognition."
The main idea is you encode each condition as an input parameter and let the network learn dependency between the condition and the feature-label mapping. On a new dataset instead of adapting the entire network you just tune these weights using backprop. For example say my network looks like this
X ---->|----|
|DNN |----> Y
Z --- >|----|
X: features Y: labels Z:condition codes
Now given a pretrained DNN, and X',Y' on a new dataset I am trying to estimate the Z' using backprop that will minimize prediction error on Y'. The math seems straightforward except I am not sure how to implement this in keras without having access to the backprop itself.
For instance, can I add an Input() layer with trainable=True with all other layers set to trainable= False. Can backprop in keras update more than just layer weights? Or is there a way to hack keras layers to do this?
Any suggestions welcome.
thanks
I figured out how to do this (exactly) in Keras by looking at fchollet's post here
Using the keras backend I was able to compute the gradient of my loss w.r.t to Z directly and used it to drive the update.
Code below:
import keras.backend as K
import numpy as np
model.summary() #Pretrained model
loss = K.categorical_crossentropy(Y, Y_out)
grads = K.gradients(loss, Z)
grads /= (K.sqrt(K.mean(K.square(grads)))+ 1e-5)
iterate = K.function([X,Z],[loss,grads])
step = 0.1
Z_adapt = Z_in.copy()
for i in range(100):
loss_val, grads_val = iterate([X_in,Z_adapt])
Z_adapt -= grads_val[0] * step
print "iter:",i,np.mean(loss_value)
print "Before:"
print model.evaluate([X_in, Z_in],Y_out)
print "After:"
print model.evaluate([X_in, Z_adapt],Y_out)
X,Y,Z are nodes in the model graph. Z_in is an initial value for Z'. I set it to an average value from the train set. Z_adapt is after 100 iterations of gradient descent and should give you a better result.
Assume that the size of Z is m x n. Then you can first define an input layer of size m * n x 1. The input will be an m * n x 1 vector of ones. You can define a dense layer containing m * n neurons and set trainable = True for it. The response of this layer will give you a flattened version of Z. Reshape it appropriately and give it as input to the rest of the network that can be appended ahead of this.
Keep in mind that if the size of Z is too large, then network may not be able to learn a dense layer of that many neurons. In that case, maybe you need to put additional constraints or look into convolutional layers. However, convolutional layers will put some constraints on Z.