Background
I used Python and Keras to implement the model of [1].
The model's structure is described in Fig.3 of this paper:
[1] Zheng, Y.: Time Series Classification Using Multi-Channels Deep Convolutional Neural Networks, 2014
Problem
The trained model predicts only one of the four classes for every input, for example [3,3,3,...,3] (all 3's).
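A quick way to confirm the collapse is to tally the predicted labels. This is a minimal sketch; the y_pred list is made up and only stands in for the argmax of the model output, it is not taken from the repository.

```python
from collections import Counter

# Hypothetical predicted labels, standing in for the argmax of the model output.
y_pred = [3, 3, 3, 3, 3, 3, 3, 3]

counts = Counter(y_pred)
for label in range(4):
    share = counts.get(label, 0) / len(y_pred)
    print(f"class {label}: {counts.get(label, 0)} predictions ({share:.0%})")

# True when every prediction falls into the same class.
collapsed = len(counts) == 1
print("collapsed to a single class:", collapsed)
```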
My code at Github
Run main_q02.py
The model is defined in mcdcnn_3.py
Utility functions are defined in utils.py and PAMAP2Utils.py
Dataset Download
The code requires only two files:
PAMAP2_Dataset/Protocol/subject101.dat
PAMAP2_Dataset/Protocol/subject102.dat
About the dataset
The dataset classes are NOT balanced.
class: 0, 1, 2, 3
number of samples (%): 28.76%, 36.18%, 18.42%, 16.64%
Note: computed over all 7 subjects
Does one class dominate? Classes 0 and 1 together account for about 65% of all samples:
class 0: 28.76%
class 1: 36.18%
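One common mitigation for imbalance like this is to weight each class by its inverse frequency. The sketch below only computes such weights from the percentages listed above; passing them to Keras through the class_weight argument of fit is my assumption about how they would be used, and whether it fixes the single-class predictions here is untested.

```python
# Class shares as listed above (fractions of all samples).
class_share = {0: 0.2876, 1: 0.3618, 2: 0.1842, 3: 0.1664}

n_classes = len(class_share)
# Inverse-frequency weights: a class holding exactly 1/n_classes of the data gets weight 1.
class_weight = {c: 1.0 / (n_classes * p) for c, p in class_share.items()}

for c, w in sorted(class_weight.items()):
    print(f"class {c}: weight {w:.3f}")
```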
Additional details
Operating system: Ubuntu 14.04 LTS
Version of python packages:
Theano (0.8.2)
Keras (1.1.0)
numpy (1.13.0)
pandas (0.20.2)
Details of the model (from the paper):
"separate multivariate time series into univariate ones and perform feature learning on each univariate series individually." [1]
"adopt sigmoid function in all activation layers" [1]
"utilize average pooling without overlapping" [1]
use stochastic gradient descent (SGD) for learning
parameters: momentum = 0.9, decay = 0.0005, learning rate = 0.01
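To make the optimizer settings concrete, here is a sketch of one SGD-with-momentum update on a single scalar weight. It reads "decay" as time-based learning-rate decay, which is an assumption on my part; the paper may instead mean weight decay. The function name and the toy objective are made up for illustration.

```python
def sgd_momentum_step(w, v, grad, step, lr=0.01, momentum=0.9, decay=0.0005):
    """One update of a single scalar weight; returns the new (w, v)."""
    lr_t = lr / (1.0 + decay * step)   # assumed time-based learning-rate decay
    v = momentum * v - lr_t * grad     # velocity accumulates past gradients
    return w + v, v

# Toy objective f(w) = w**2 (gradient 2*w), purely for illustration.
w, v = 1.0, 0.0
for step in range(200):
    w, v = sgd_momentum_step(w, v, grad=2.0 * w, step=step)
print(f"w after 200 steps: {w:.6f}")
```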
Related
I want to do a 2-class segmentation using the dense_vnet model available in NiftyNet, which originally does a 9-class segmentation.
I tried to retrain only the last layer by changing the config file according to this suggestion: How to fine tune niftynet pre trained model for custom data
vars_to_restore = ^((?!DenseVNet\/(skip_conv|fin_conv)).)*$
num_classes = 2
error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign
requires shapes of both tensors to match. lhs shape= [2] rhs shape=
[9]
[[{{node save/Assign_8}} = Assign[T=DT_FLOAT, class=["loc:#DenseVNet/conv/conv/b"], use_locking=true, validate_shape=true,
device="/job:localhost/replica:0/task:0/device:CPU:0"](DenseVNet/conv/conv/b,
save/RestoreV2:8)]]
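It looks like you have restored too many layers; some of them are still shaped for 9 classes. In the error, the bias DenseVNet/conv/conv/b has shape [2] in the new graph but [9] in the checkpoint, and your vars_to_restore pattern does not exclude it. Inspect the architecture and exclude from restoring every layer whose shape depends on the number of classes.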
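You can check which variable names the pattern lets through with plain Python. In this sketch only DenseVNet/conv/conv/b comes from the error message; the other two names are made-up examples, and the widened pattern at the end is a hypothetical variation, not a verified fix for this model.

```python
import re

# Pattern from the config above: restore every variable whose name does not
# contain DenseVNet/skip_conv or DenseVNet/fin_conv.
vars_to_restore = re.compile(r"^((?!DenseVNet\/(skip_conv|fin_conv)).)*$")

# Only the first name comes from the error message; the others are made-up examples.
names = [
    "DenseVNet/conv/conv/b",
    "DenseVNet/skip_conv/conv/w",
    "DenseVNet/fin_conv/conv/w",
]
for name in names:
    print(name, "-> restored" if vars_to_restore.match(name) else "-> skipped")

# Hypothetical widened pattern that also skips names under DenseVNet/conv/.
widened = re.compile(r"^((?!DenseVNet\/(skip_conv|fin_conv|conv\/)).)*$")
print("widened:", "restored" if widened.match(names[0]) else "skipped")
```

The first name is matched by the original pattern, so it gets restored from the 9-class checkpoint, which is consistent with the shape mismatch in the error.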
I am using the VGG-16 network available in PyTorch out of the box to predict the class index of an image. I found that for the same input file, if I predict multiple times, I get different outcomes. This seems counter-intuitive to me. Once the weights are fixed (since I am using the pretrained model), there should not be any randomness at any step, and hence multiple runs with the same input file should return the same prediction.
Here is my code:
import torch
import torchvision.models as models
import torchvision.transforms as transforms
from PIL import Image
from torch.nn.functional import softmax

VGG16 = models.vgg16(pretrained=True)

def VGG16_predict(img_path):
    # crop to the 224x224 input size and convert the image to a tensor
    transformer = transforms.Compose([transforms.CenterCrop(224), transforms.ToTensor()])
    data = transformer(Image.open(img_path))
    output = softmax(VGG16(data.unsqueeze(0)), dim=1).argmax().item()
    return output  # predicted class index

VGG16_predict(image)
Here is the image
Recall that many modules have two states for training vs evaluation: "Some models use modules which have different training and evaluation behavior, such as batch normalization. To switch between these modes, use model.train() or model.eval() as appropriate. See train() or eval() for details." (https://pytorch.org/docs/stable/torchvision/models.html)
In this case, the classifier layers include dropout, which is stochastic during training. Run VGG16.eval() if you want the evaluations to be non-random.
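To see why the mode matters, here is a toy stand-in for a dropout layer in plain Python. It is not PyTorch's implementation, and the class name ToyDropout is made up; it only mimics the idea that the layer is random in training mode and the identity in evaluation mode.

```python
import random

class ToyDropout:
    """Toy stand-in for a dropout layer; not PyTorch's implementation."""
    def __init__(self, p=0.5, seed=0):
        self.p = p
        self.training = True          # like a freshly built module, starts in training mode
        self.rng = random.Random(seed)

    def eval(self):
        self.training = False
        return self

    def __call__(self, values):
        if not self.training:
            return list(values)       # evaluation mode: identity, no randomness
        keep = 1.0 - self.p
        # training mode: zero each value with probability p, rescale the survivors
        return [v / keep if self.rng.random() < keep else 0.0 for v in values]

layer = ToyDropout(p=0.5)
x = [1.0] * 64
print(layer(x) == layer(x))   # two training-mode passes over the same input
layer.eval()
print(layer(x) == layer(x))   # two evaluation-mode passes over the same input
```

In training mode each pass draws a fresh random mask, so repeated predictions on the same input can differ; after eval() the layer passes its input through unchanged.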
I would like to create a fully convolutional network for binary image classification in PyTorch that can take dynamic input image sizes, but I don't quite understand conceptually the idea behind changing the final layer from a fully connected layer to a convolutional layer. Here and here both state that this is possible by using a 1x1 convolution.
Suppose I have a 16x16x1 image as input to the CNN. After several convolutions, the output is a 16x16x32. If using a fully connected layer, I can produce a single value output by creating 16*16*32 weights and feeding it to a single neuron. What I don't understand is how you would get a single value output by applying a 1x1 convolution. Wouldn't you end up with 16x16x1 output?
Check this link: http://cs231n.github.io/convolutional-networks/#convert
In this case, your convolution layer should use a 16 x 16 filter with 1 output channel and no padding. This will convert the 16 x 16 x 32 input into a single output.
Sample code to test:
from keras.layers import Conv2D, Input
from keras.models import Model
import numpy as np

inputs = Input((16, 16, 32))
outputs = Conv2D(1, 16)(inputs)  # one 16x16 filter, default 'valid' padding
model = Model(inputs, outputs)
model.summary()  # check the output shape: (None, 1, 1, 1)
prediction = model.predict(np.zeros((1, 16, 16, 32)))  # check on sample data
print(f'output is {np.squeeze(prediction)}')
The fully convolutional approach is useful in segmentation tasks that use patch-based approaches, since you can speed up prediction (inference) by feeding in a larger portion of the image.
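The shape arithmetic behind this can be sketched with the standard convolution output-size formula. The helper name conv_output_size is made up for illustration.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

print(conv_output_size(16, 16))  # 1: the 16x16 kernel covers the whole 16x16 input
print(conv_output_size(32, 16))  # 17: a 32x32 input yields a 17x17 map of scores
print(conv_output_size(16, 1))   # 16: a 1x1 kernel keeps the 16x16 spatial size
```

So a 1x1 convolution does leave you with a 16x16x1 output, as the question suspects, while a kernel as large as the feature map collapses it to a single value. Feeding a larger image to the same layer produces a grid of outputs, one per patch position.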
For classification tasks, you usually have a fc layer at the end. In that case, a layer like AdaptiveAvgPool2d is used which ensures the fc layer sees a constant input feature size irrespective of the input image size.
https://pytorch.org/docs/stable/nn.html#adaptiveavgpool2d
See this pull request for torchvision VGG: https://github.com/pytorch/vision/pull/747
In Keras, the equivalent is GlobalAveragePooling2D. See the example "Fine-tune InceptionV3 on a new set of classes".
https://keras.io/applications/
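What global average pooling does can be sketched in plain Python on nested lists. This is only an illustration of the idea, not the PyTorch or Keras implementation, and the helper constant_map is made up to build test inputs.

```python
def global_average_pool(feature_map):
    """Average an H x W x C nested list over H and W, returning C values."""
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    sums = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                sums[c] += value
    return [s / (height * width) for s in sums]

def constant_map(height, width, channels):
    """Hypothetical helper: a feature map where channel c holds the value c everywhere."""
    return [[[float(c) for c in range(channels)] for _ in range(width)] for _ in range(height)]

small = global_average_pool(constant_map(16, 16, 32))
large = global_average_pool(constant_map(40, 24, 32))
print(len(small), len(large))  # both 32, regardless of the spatial size
```

Because the result always has one value per channel, the fully connected layer that follows sees the same input size for any image size.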
I hope you are familiar with Keras. Your image is 16*16*1. It will be passed to a Keras convolutional layer, but first we have to create the model with model = Sequential(), which gives us a Keras model instance. Then we add a convolutional layer with our parameters, like
model.add(Conv2D(20,(2,2),padding="same"))
Here we apply 20 filters to the image, so the output becomes 16*16*20. To extract richer features we add more conv layers, like
model.add(Conv2D(32,(2,2),padding="same"))
Now we apply 32 filters, after which the output has size 16*16*32.
Don't forget to put an activation after each conv layer. If you are new to this, you should study activations, optimization, and loss functions; these are the basic parts of neural networks.
Now it's time to move on to the fully connected layers. First we need to flatten the feature maps, because a fully connected layer only works on 2D input of shape (no_of_examples, features). In your case
the feature dimension after flattening will be 16*16*32.
model.add(Flatten())
After flattening, the network passes the result to the fully connected layers.
model.add(Dense(32))
model.add(Activation("relu"))
model.add(Dense(8))
model.add(Activation("relu"))
model.add(Dense(2))
The last layer has 2 neurons because you have a binary classification problem. If you had to classify 3 classes, the last layer would have 3 neurons; for 10 classes, the last dense layer would have 10 neurons.
model.add(Activation("softmax"))
model.compile(loss='categorical_crossentropy',  # matches the 2-unit softmax output
              optimizer=Adam(),
              metrics=['accuracy'])
return model
After this you have to fit the model.
estimator = model(classes=2)
estimator.fit(X_train,y_train)
full code:
from keras.models import Sequential
from keras.layers import Conv2D, Activation, Flatten, Dense
from keras.optimizers import Adam

def model(classes):
    model = Sequential()
    # conv2d set: Conv2D => ReLU (input_shape matches the 16*16*1 image)
    model.add(Conv2D(20, (5, 5), padding="same", input_shape=(16, 16, 1)))
    model.add(Activation("relu"))
    model.add(Conv2D(32, (5, 5), padding="same"))
    model.add(Activation("relu"))
    # flatten the 16*16*32 feature maps for the fully connected layers
    model.add(Flatten())
    model.add(Dense(32))
    model.add(Activation("relu"))
    model.add(Dense(8))
    model.add(Activation("relu"))
    # softmax classifier with one output neuron per class
    model.add(Dense(classes))
    model.add(Activation("softmax"))
    model.compile(loss='categorical_crossentropy',
                  optimizer=Adam(lr=0.0001, decay=1e-6),
                  metrics=['accuracy'])
    return model
You can also refer to this kernel.
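One thing to keep in mind with Flatten followed by Dense, given that the question asks about dynamic input sizes: the flattened length depends on the image size, so the Dense weights only fit one input size. A small arithmetic sketch, with made-up helper names:

```python
def flattened_length(height, width, channels):
    """Number of features Flatten() produces for an H x W x C feature map."""
    return height * width * channels

def dense_parameter_count(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

# With padding="same" and no pooling, the conv stack keeps the spatial size.
for side in (16, 32):
    n_in = flattened_length(side, side, 32)
    n_params = dense_parameter_count(n_in, 32)
    print(f"{side}x{side} input -> {n_in} features, Dense(32) needs {n_params} parameters")
```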
I have pairs of movies, each described by 2783 features.
Each movie's vector is defined as follows: if the feature is present in the movie the entry is 1, otherwise it is 0.
Example :
movie 1 = [0,0,1,0,1,0,1 ...] & movie 2 = [1,0,1,1,1,0,1 ...]
Each pair has a label of 1 or 0.
movie1,movie2=0
movie1,movie4=1
movie2,movie150=0
The input is similar to SGNS (Skip gram negative sampling) word2vec model.
My goal is to find the similarity between programs and learn an embedding for each movie.
I'd like to build a kind of 'SGNS implementation with Keras'. However, my input is not one-hot, so I can't use Embedding layers. I tried to use Dense layers and merge them with a dot product. I'm not sure about the model architecture, and I got errors.
from keras.layers import Dense,Input,LSTM,Reshape
from keras.models import Model,Sequential
n_of_features = 2783
n_embed_dims = 20
# movie1 vectors
word= Sequential()
word.add(Dense(n_embed_dims, input_dim=(n_words,)))
# movie2 vectors
context = Sequential()
context.add(Dense(n_embed_dims, input_dim=n_words,))
model = Sequential()
model.add(keras.layers.dot([word, context], axes=1))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
loss='mean_squared_error')
Does anyone have an idea how to implement this?
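For reference, the computation being attempted can be sketched in plain Python: a Dense layer without bias or activation applied to a 0/1 vector just sums the weight rows of the features that are present, which plays the role of an embedding lookup for multi-hot input. The sizes, the random weights, and the function names below are made up for illustration; in Keras, a two-input model like this would need the functional API rather than Sequential, since a Sequential model takes a single input.

```python
import math
import random

n_features, n_embed_dims = 8, 3   # small stand-ins for 2783 features and 20 dimensions
rng = random.Random(0)

def random_matrix():
    """One weight row per feature, as a bias-free Dense layer would hold."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_embed_dims)] for _ in range(n_features)]

W_word, W_context = random_matrix(), random_matrix()

def embed(weights, multi_hot):
    """Dense layer without bias or activation: sum the rows of active features."""
    return [sum(weights[i][d] for i, x in enumerate(multi_hot) if x) for d in range(n_embed_dims)]

def pair_score(movie_a, movie_b):
    """Sigmoid of the dot product of the two movie embeddings."""
    dot = sum(a * b for a, b in zip(embed(W_word, movie_a), embed(W_context, movie_b)))
    return 1.0 / (1.0 + math.exp(-dot))

movie1 = [0, 0, 1, 0, 1, 0, 1, 0]
movie2 = [1, 0, 1, 1, 1, 0, 1, 0]
print(pair_score(movie1, movie2))
```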
If you're not wedded to Keras, you could probably model this by turning each movie into a synthetic 'document' with tokens for each feature-that-is-present. Then, use a 'Paragraph Vectors' implementation in pure PV-DBOW mode to learn a vector for each movie.
(In pure PV-DBOW, dense doc-vectors are learned to predict each word in a document, without regard to order/word-adjacency/etc. It is a bit like skip-gram, but the training pairs are not 'word to every nearby word' but 'doc-token to every in-doc word'.)
In gensim, the Doc2Vec class with initialization parameter dm=0 uses pure PV-DBOW mode.
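The "synthetic document" step could look like the sketch below. The token prefix and the function name are arbitrary choices of mine; the resulting token lists would then be handed to the Doc2Vec training, one tagged document per movie.

```python
def movie_to_document(feature_vector, prefix="f"):
    """Turn a 0/1 feature vector into tokens naming the features that are present."""
    return [f"{prefix}{i}" for i, present in enumerate(feature_vector) if present]

# Shortened example vectors; the real ones have 2783 entries.
movies = {
    "movie1": [0, 0, 1, 0, 1, 0, 1],
    "movie2": [1, 0, 1, 1, 1, 0, 1],
}
documents = {name: movie_to_document(vec) for name, vec in movies.items()}
print(documents["movie1"])  # ['f2', 'f4', 'f6']
print(documents["movie2"])  # ['f0', 'f2', 'f3', 'f4', 'f6']
```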
I am trying to implement a model in Keras that is composed of two layers (branches) to segment object candidates.
The model has the following architecture:
Image (channel, width, height) -> multiple convolution and pooling layers -> output ('n' feature maps, height, width)
This single output is then used by two layers,
which are as follows:
1) convolution (1*1) -> dense layer with m units (output = n*1*1) -> pixel classifier using fully connected layers of h*w dimension -> upsampling to (H,N) -> output
2) convolution -> max pooling -> dense layer -> score
The cost function uses the outputs of both layers: it is the sum of the binary logistic regression losses of the two outputs.
Now I have two questions:
1) How do I implement a dense connection over the convolved output in layer 1 to produce the h*w pixel classifier mentioned above?
2) How do I merge the two layers to calculate the single cost function and then train both layers jointly using back-propagation?
Can anyone tell me how to create the model for the network architecture described above? I am new to deep learning, so if I have misunderstood something, I would appreciate an explanation of the errors in my understanding.
Thanks
It's easier when you share the code you already have.
For the transition from convolution to dense, you have to use model.add(Flatten()), like in the examples here.
Unfortunately, I don't know the answer to the second question, but according to what I just read in the Keras Models documentation, you have to use the graph model.
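As for the cost itself, the sum of the two binary logistic losses described in the question can be sketched in plain Python. The function names and the numbers are made up; this only shows the scalar being minimized, not how to wire it into Keras.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary logistic loss over paired labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)   # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def joint_loss(mask_true, mask_prob, score_true, score_prob):
    """Sum of the per-pixel mask loss and the object-score loss."""
    mask_loss = binary_cross_entropy(mask_true, mask_prob)
    score_loss = binary_cross_entropy([score_true], [score_prob])
    return mask_loss + score_loss

# Made-up values: a flattened 2x2 mask and one object score.
loss = joint_loss([1, 0, 0, 1], [0.9, 0.2, 0.1, 0.8], 1, 0.7)
print(f"joint loss: {loss:.4f}")
```

Because the gradient of a sum is the sum of the gradients, minimizing this single scalar updates both branches and the shared convolutional layers together.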