I am trying to classify malicious url based on numeric features,the shape of my training dataset is (123036,16),
how do I setup my first layer of CNN.
layers.Conv1D(filters=64, kernel_size=3, activation='relu',input_shape=[16]),
Related
I am trying to reproduce a Neural Network trained to detect whether there is a 0-3 digit in an image with another confounding image. The paper I am following lists the corresponding architecture:
A neural network with 28×56 input neurons and one output neuron is
trained on this task. The input values are coded between −0.5 (black)
and +1.5 (white). The neural network is composed of a first detection
pooling layer with 400 detection neurons sum-pooled into 100 units
(i.e. we sum-pool non-overlapping groups of 4 detection units). A
second detection-pooling layer with 400 detection neurons is applied
to the 100-dimensional output of the previous layer, and activities
are sum-pooled onto a single unit representing the deep network
output. Positive examples (0-3 digit in the image) are assigned target
value 100 and negative examples are assigned target value 0. The
neural network is trained to minimize the mean-square error between
the target values and its output.
My main doubt is in this context what they mean by detection neurons, if they mean filters or a single standard ReLU neuron. Also, if the mean filters, how could they be applied in the second layer to a 100-dimensional output when they are designed to operate on 2x2 matrixes.
Reference:
Montavon, G., Bach, S., Binder, A., Samek, W., & Müller, K. (2015).
Explaining NonLinear Classification Decisions with Deep Taylor
Decomposition. arXiv. https://doi.org/10.1016/j.patcog.2016.11.008.
Specifically section 4.C
Thanks a lot for the help!
My best guess at this is something like (code not tested - just rough PyTorch):
from torch import nn
class Model(nn.Module):
def __init__(self):
super().__init__()
self.layer1 = nn.Sequential(
nn.Flatten(), # Flatten row-wise into a 1D sequence
nn.Linear(28 * 56, 400), # Linear layer with 400 outputs.
nn.AvgPool1D(4, 4), # Sum pool to 100 outputs.
)
self.layer2 = nn.Sequential(
nn.Linear(100, 400), # Linear layer with 400 outputs.
nn.AdaptiveAvgPool1D(1), # Sum pool to 1 output.
)
def forward(self, x):
return self.layer2(self.layer1(x))
But overall I would agree with the commentor on your post that there are some issues with language here.
I want to implement lstms with CNN in pytorch as my data is a time series data i.e. frames of video for heart rate detection, I am struggling with the input and output dimensions for lstms what and how i should properly configure the dimensions/parameters/arguments at input of lstms in pytorch as its quite confusing when considering time steps, hidden state etc.
my output from CNN is “2 batches of 256 frames”, which is now the input to lstms
batch is 2
features =256
the output is also a batch of 2 with 256 frames.
Generally, the input shape of sequential data takes the form (batch_size, seq_len, num_features). Based on your explanation, I assume your input is of the form (2, 256), where 2 is the batch size and 256 is the sequence length of scalars (1-dimensional tensor). Therefore, you should reshape your input to be (2, 256, 1) by inputs.unsqueeze(2).
To declare and use an LSTM model, simply try
from torch import nn
model = nn.LSTM(
input_size=1, # 1-dimensional features
batch_first=True, # batch is the first (zero-th) dimension
hidden_size=some_hidden_size, # maybe 64, 128, etc.
num_layers=some_num_layers, # maybe 1 or 2
proj_size=1, # output should also be 1-dimensional
)
outputs, (hidden_state, cell_state) = model(inputs)
Simple and short question. I have a network (Unet) which performs image segmentation. I want the logits as the output to feed into the cross entropy loss (using pytorch). Currently my final layer looks as so:
class Logits(nn.Sequential):
def __init__(self,
in_channels,
n_class
):
super(Logits, self).__init__()
# fully connected layer outputting the prediction layers for each of my classes
self.conv = self.add_module('conv_out',
nn.Conv2d(in_channels,
n_class,
kernel_size = 1
)
)
self.activ = self.add_module('sigmoid_out',
nn.Sigmoid()
)
Is it correct to use the sigmoid activation function here? Does this give me logits?
When people talk about "logits" they usually refer to the "raw" n_class-dimensional output vector. For multi-class classification (n_class > 2) you want to convert the n_class-dimensional vector of raw "logits" into a n_class-dim probability vector.
That is, you want prob = f(logits) with prob_i >= 0 for all n_class entries, and that sum(prob)=1.
The most straight forward way of doing that in a differentiable way is to use the Softmax function:
prob_i = softmax(logits) = exp(logits_i) / sum_j exp(logits_j)
It is easy to see that the output of softmax is indeed a n_class-dim probability vector (I leave it to you as a short exercise).
BTW, this is why the raw predictions are called "logits" because they are kind of "log" of the output predicted probabilities.
Now, it is customary not to explicitly compute the softmax on top of a classification network and defer its computation to the loss function, e.g. nn.CrossEntropyLoss that internally computes the softmax and requires the raw logits as inputs, rather than the normalized probabilities. This is done mainly for numerical stability.
Therefore, if you are training a multi-class classification network with nn.CrossEntropyLoss you do not need to worry at all about the final activation and simply output the raw logits from your final conv/linear layer.
Most importantly, do not use nn.Sigmoid() activation as it tends to have saturated gradients and will mess up your training.
As far as I understood, you are working on a multi-label classification task where a single input can have several labels, hence your usage of nn.Sigmoid (vs nn.Softmax for multi-class classification).
There a loss function which combines nn.Sigmoid and the nn.BCELoss: nn.BCEWithLogitsLoss. So you would have as input, a vector of logits whose length is the number of classes. And, the target would as well have the same shape: as a multi-hot-encoding, with 1s for active classes.
I created Convolutional Autoencoder using Pytorch and I'm trying to improve it.
For the encoding layer I use first 4 layers of pre-trained ResNet 18 model from torchvision.models.resnet.
I have mid-layer with just one Convolutional layer with input and output channel sizes of 512. For the decoding layer I use Convolutional layers following with BatchNorm and ReLU activation function.
The decoding layer reduces the channel each layer: 512 -> 256 -> 128 -> 64 -> 32 -> 16 -> 3 and increases the resolution of the image with interpolation to match the dimension of the corresponding layer in the encoding part. For the last layer I use sigmoid instead of ReLu.
All Convolutional layers are:
self.up = nn.Sequential(
nn.Conv2d(input_channels, output_channels,
kernel_size=5, stride=1,
padding=2, bias=False),
nn.BatchNorm2d(output_channels),
nn.ReLU()
)
The input images are scaled to [0, 1] range and have shapes 224x224x3. Sample outputs are (First is from training set, the second from the test set):
First image
First image output
Second image
Second image output
Any ideas why output is blurry? The provided model has been trained around 160 epochs with ~16000 images using Adam optimizer with lr=0.00005. I'm thinking about adding one more Convolutional layer in self.up given above. This will increase complexity of the model, but I'm not sure if it is the right way to improve the model.
I would like to create a fully convolution network for binary image classification in pytorch that can take dynamic input image sizes, but I don't quite understand conceptually the idea behind changing the final layer from a fully connected layer to a convolution layer. Here and here both state that this is possible by using a 1x1 convolution.
Suppose I have a 16x16x1 image as input to the CNN. After several convolutions, the output is a 16x16x32. If using a fully connected layer, I can produce a single value output by creating 16*16*32 weights and feeding it to a single neuron. What I don't understand is how you would get a single value output by applying a 1x1 convolution. Wouldn't you end up with 16x16x1 output?
Check this link: http://cs231n.github.io/convolutional-networks/#convert
In this case, your convolution layer should be a 16 x 16 filter with 1 output channel. This will convert the 16 x 16 x 32 input into a single output.
Sample code to test:
from keras.layers import Conv2D, Input
from keras.models import Model
import numpy as np
input = Input((16,16,32))
output = Conv2D(1, 16)(input)
model = Model(input, output)
print(model.summary()) # check the output shape
output = model.predict(np.zeros((1, 16, 16, 32))) # check on sample data
print(f'output is {np.squeeze(output)}')
This approach of Fully convolutional networks are useful in segmentation tasks using patch based approaches since you can speed up prediction(inference) by feeding a bigger portion of the image.
For classification tasks, you usually have a fc layer at the end. In that case, a layer like AdaptiveAvgPool2d is used which ensures the fc layer sees a constant input feature size irrespective of the input image size.
https://pytorch.org/docs/stable/nn.html#adaptiveavgpool2d
See this pull request for torchvision VGG: https://github.com/pytorch/vision/pull/747
In case of Keras, GlobalAveragePooling2D. See the example, "Fine-tune InceptionV3 on a new set of classes".
https://keras.io/applications/
I hope you are familier with keras. Now see your image is of 16*16*1. Image will pass to the keras convoloutional layer but first we have to create the model. like model=Sequential() by this we are able to get keras model instance. now we will give our convoloutional layer with our parameters like
model.add(Conv2D(20,(2,2),padding="same"))
now here we are adding 20 filters to our image. and our image becomes 16*16*20 now for more best features we add more conv layers like
model.add(Conv2D(32,(2,2),padding="same"))
now we add 32 filters to your image after this your image will be size of 16*16*32
dont forgot to put activation after conv layers. If you are new than you should study about activations, Optimization and loss of the network. these are the basic part of neural Networks.
Now its time to move towards fully connected layer. First we need to flatten our image because fully connected layer only works on 2d vectors (no_of_ex,image_dim) in your case
imgae diminsion after applying flattening will be (16*16*32)
model.add(Flatten())
after flatening our image your network will give it to fully connected layers
model.add(Dense(32))
model.add(Activation("relu"))
model.add(Dense(8))
model.add(Activation("relu"))
model.add(Dense(2))
because you are having a problem of binary classification if you have to classify 3 classes than last layer will have 3 neuron if you have to classify 10 examples than your last dense layer willh have 10 neuron.
model.add(Activation("softmax"))
model.compile(loss='binary_crossentropy',
optimizer=Adam(),
metrics=['accuracy'])
return model
after this you have to fit this model.
estimator=model()
estimator.fit(X_train,y_train)
full code:
def model (classes):
model=Sequential()
# conv2d set =====> Conv2d====>relu=====>MaxPooling
model.add(Conv2D(20,(5,5),padding="same"))
model.add(Activation("relu"))
model.add(Conv2D(32,(5,5),padding="same"))
model.add(Activation("relu"))
model.add(Flatten())
model.add(Dense(32))
model.add(Activation("relu"))
model.add(Dense(8))
model.add(Activation("relu"))
model.add(Dense(2))
#now adding Softmax Classifer because we want to classify 10 class
model.add(Dense(classes))
model.add(Activation("softmax"))
model.compile(loss='categorical_crossentropy',
optimizer=Adam(lr=0.0001, decay=1e-6),
metrics=['accuracy'])
return model
You can take help from this kernal