While coding a GAN, I encountered an error saying `'NoneType' object is not callable`. Can you explain this error and suggest some possible solutions? - deep-learning

I was trying to create a Generative Adversarial Network using PyTorch. I coded the discriminator block and printed its summary. After that, I moved on to the generator block. I defined the forward() function and reshaped the input noise from (batch_size, noise_dim) to (batch_size, channel, height, width). When I ran the code to get the summary, an error popped up saying 'NoneType' object is not callable. I searched Stack Overflow and other places, but my problem wasn't resolved.
I first created a generator block function with the following code:
def get_gen_block(in_channels, out_channels, kernel_size, stride, final_block = False):
    if final_block == True:
        return nn.Sequential(
            nn.ConvTranspose2d(in_channels, out_channels, kernel_size, stride),
            nn.Tanh()
        )
        return nn.Sequential(
            nn.ConvTranspose2d(in_channels, out_channels, kernel_size, stride),
            nn.BatchNorm2d(out_channels),
            nn.ReLU()
        )
Then I defined a Generator class that stacks several of these blocks, and defined its forward() function like this:
class Generator(nn.Module):
    def __init__(self, noise_dim):
        super(Generator, self).__init__()
        self.noise_dim = noise_dim
        self.block_1 = get_gen_block(noise_dim, 256, (3, 3), 2)
        self.block_2 = get_gen_block(256, 128, (4, 4), 1)
        self.block_3 = get_gen_block(128, 64, (3, 3), 2)
        self.block_4 = get_gen_block(64, 1, (4, 4), 2, final_block=True)
    def forward(self, r_noise_vec):
        x = r_noise_vec.view(-1, self.noise_dim, 1, 1)
        x1 = self.block_1(x)
        x2 = self.block_2(x1)
        x3 = self.block_3(x2)
        x4 = self.block_4(x3)
        return x4
After this, when I printed the summary for the generator, the error occurred, pointing to the line 'x1 = self.block_1(x)' and saying 'NoneType' object is not callable. Can anyone please help me resolve this issue?
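Separately from the error, it can help to check the spatial sizes that the four blocks would produce once they are callable. The following is only a sketch: it assumes the nn.ConvTranspose2d defaults (padding=0, output_padding=0, dilation=1), and the helper name conv_transpose_out is made up for illustration.

```python
def conv_transpose_out(size, kernel, stride, padding=0, output_padding=0, dilation=1):
    """Output length of one spatial axis of a transposed convolution."""
    return (size - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1

# (kernel, stride) for block_1 .. block_4, copied from the Generator above
blocks = [(3, 2), (4, 1), (3, 2), (4, 2)]

size = 1  # the noise vector is viewed as (batch, noise_dim, 1, 1)
sizes = [size]
for kernel, stride in blocks:
    size = conv_transpose_out(size, kernel, stride)
    sizes.append(size)

print(sizes)  # [1, 3, 6, 13, 28]
```

Under these assumptions the generator ends at 28 x 28 with one channel.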

Please check your get_gen_block function. It looks like you missed an else: branch or messed up the indentation, so when final_block = False it returns None instead of
return nn.Sequential(
    nn.ConvTranspose2d(in_channels, out_channels, kernel_size, stride),
    nn.BatchNorm2d(out_channels),
    nn.ReLU()
)
This pattern:
if cond:
    return module1
    return module2
always returns module1 when the condition is met, otherwise None.
I think you wanted this:
if cond:
    return module1
return module2
which returns module1 when the condition is met and module2 otherwise. Now compare the indentation.
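The difference can be reproduced without PyTorch. In this small sketch, strings stand in for the nn.Sequential modules, and the function names buggy and fixed are made up for illustration:

```python
def buggy(cond):
    if cond:
        return "module1"
        return "module2"  # unreachable: it sits inside the if, after a return

def fixed(cond):
    if cond:
        return "module1"
    return "module2"  # runs whenever cond is False

print(buggy(False))  # None
print(fixed(False))  # module2

try:
    buggy(False)("some input")  # same situation as self.block_1(x)
except TypeError as err:
    print(err)  # 'NoneType' object is not callable
```

A function that falls off its end without hitting a return gives back None, so the block attribute ends up holding None and calling it raises the error from the question.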

Related

how to best visualize CNN architecture? (experience using "PlotNeuralNet")

I'm writing a thesis and want to present a visualisation of the CNN architecture used for the analysis (written in PyTorch). I came across this cool repository PlotNeuralNet with examples for how to generate LaTeX code for drawing neural networks for reports and presentation. However, I'm having trouble finding out how to exactly define my particular architecture.
Here is an example of how one would define an architecture.
import sys
sys.path.append('../')
from pycore.tikzeng import *
# define your arch
arch = [
    to_head( '..' ),
    to_cor(),
    to_begin(),
    to_Conv("conv1", 512, 64, offset="(0,0,0)", to="(0,0,0)", height=64, depth=64, width=2 ),
    to_Pool("pool1", offset="(0,0,0)", to="(conv1-east)"),
    to_Conv("conv2", 128, 64, offset="(1,0,0)", to="(pool1-east)", height=32, depth=32, width=2 ),
    to_connection( "pool1", "conv2"),
    to_Pool("pool2", offset="(0,0,0)", to="(conv2-east)", height=28, depth=28, width=1),
    to_SoftMax("soft1", 10 ,"(3,0,0)", "(pool1-east)", caption="SOFT" ),
    to_connection("pool2", "soft1"),
    to_Sum("sum1", offset="(1.5,0,0)", to="(soft1-east)", radius=2.5, opacity=0.6),
    to_connection("soft1", "sum1"),
    to_end()
]
def main():
    namefile = str(sys.argv[0]).split('.')[0]
    to_generate(arch, namefile + '.tex' )
if __name__ == '__main__':
    main()
However, even after looking at the blocks available in the pycore module, I'm still not able to use the tool. The usage documentation is not very detailed, so I was hoping someone here would find it trivial to define the architecture below. Else, any good ways to
class Net20(nn.Module):
    """ CNN for 20-day Image
    This particular model should have:
    - 3 blocks
    - 64 layers in first block, multiply by 2 each subsequent block
    - filter size (5,3)
    - vertical stride = 3 (but only in first layer)
    - vertical dilation = 2 (but only in first layer)
    - Leaky Relu activation function
    - max pooling (2,1) at the end of each block
    """
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Sequential(
            Conv2dSame(1, 64, kernel_size=(5,3), stride=(3,1), dilation=(2,1)),
            nn.BatchNorm2d(64),
            nn.LeakyReLU(negative_slope=0.01, inplace=True),
            nn.MaxPool2d((2, 1), ceil_mode=True)
        )
        self.layer2 = nn.Sequential(
            Conv2dSame(64, 128, kernel_size=(5,3)),
            nn.BatchNorm2d(128),
            nn.LeakyReLU(negative_slope=0.01, inplace=True),
            nn.MaxPool2d((2, 1), ceil_mode=True)
        )
        self.layer3 = nn.Sequential(
            Conv2dSame(128, 256, kernel_size=(5,3)),
            nn.BatchNorm2d(256),
            nn.LeakyReLU(negative_slope=0.01, inplace=True),
            nn.MaxPool2d((2, 1), ceil_mode=True)
        )
        self.fc1 = nn.Sequential(
            nn.Dropout(p=0.5),
            nn.Linear(46080, 1),
        )
    def forward(self, x):
        x = x.reshape(-1,1,64,60)
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = x.reshape(-1,46080)
        x = self.fc1(x)
        return x
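You can try model.summary() or keras.utils.plot_model. Note that both are Keras utilities, so they only apply if the model is defined in Keras rather than PyTorch. You may want to check: https://machinelearningmastery.com/visualize-deep-learning-neural-network-model-keras/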
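Whichever drawing tool is used, the per-block tensor sizes are needed first. Here is a rough sketch of that bookkeeping for Net20. It assumes that Conv2dSame (not shown in the question) pads so that the output length is ceil(input / stride), and that the pooling follows nn.MaxPool2d with ceil_mode=True and no padding. The helper names are made up for illustration.

```python
import math

def same_conv_out(size, stride):
    # Assumption: Conv2dSame behaves like "same" padding, so out = ceil(in / stride)
    return math.ceil(size / stride)

def max_pool_out(size, kernel):
    # MaxPool2d with ceil_mode=True, no padding; stride defaults to the kernel size
    stride = kernel
    out = math.ceil((size - kernel) / stride) + 1
    if (out - 1) * stride >= size:  # the last window must start inside the input
        out -= 1
    return out

height, width = 64, 60  # from x.reshape(-1, 1, 64, 60)
shapes = []
# (output channels, (vertical stride, horizontal stride)) for layer1..layer3
for channels, (stride_h, stride_w) in [(64, (3, 1)), (128, (1, 1)), (256, (1, 1))]:
    height, width = same_conv_out(height, stride_h), same_conv_out(width, stride_w)
    height, width = max_pool_out(height, 2), max_pool_out(width, 1)
    shapes.append((channels, height, width))

flat = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
print(shapes)  # [(64, 11, 60), (128, 6, 60), (256, 3, 60)]
print(flat)    # 46080
```

The final product matches the 46080 input features of nn.Linear in fc1, which suggests the assumed padding behaviour is consistent with the model as written.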

How to make multi-inputs and multi-outputs neural network model

I converted the following code from Keras to PyTorch. The main challenge for me is building a multi-input, multi-output model similar to keras.models.Model. How can I implement the following code in PyTorch so that it accepts multiple inputs and produces multiple outputs?
from tensorflow import keras as k
import tensorflow as tf
class NetworkKeys:
    NUM_UNITS = "num_units"
    ACTIVATION = "activation"
    L2_REG_FACT = "l2_reg_fact"
    DROP_PROB = "drop_prob"
    BATCH_NORM = "batch_norm"
def build_dense_network(input_dim, output_dim,
                        output_activation, params, with_output_layer=True):
    model = k.models.Sequential()
    activation = params.get(NetworkKeys.ACTIVATION, "relu")
    l2_reg_fact = params.get(NetworkKeys.L2_REG_FACT, 0.0)
    regularizer = k.regularizers.l2(l2_reg_fact) if l2_reg_fact > 0 else None
    drop_prob = params.get(NetworkKeys.DROP_PROB, 0.0)
    batch_norm = params.get(NetworkKeys.BATCH_NORM, False)
    last_dim = input_dim
    for i in range(len(params[NetworkKeys.NUM_UNITS])):
        model.add(k.layers.Dense(units=params[NetworkKeys.NUM_UNITS][i],
                                 kernel_regularizer=regularizer,
                                 input_dim=last_dim))
        if batch_norm:
            model.add(k.layers.BatchNormalization())
        model.add(k.layers.Activation(activation))
        last_dim = params[NetworkKeys.NUM_UNITS][i]
        if drop_prob > 0.0:
            model.add(k.layers.Dropout(rate=drop_prob))
    if with_output_layer:
        model.add(k.layers.Dense(units=output_dim, activation=output_activation))
    return model
ldre_net = build_dense_network(input_dim=input_dim, output_dim=1,
                               output_activation=k.activations.linear,
                               params=hidden_params)
p_samples = k.layers.Input(shape=(input_dim,))
q_samples = k.layers.Input(shape=(input_dim,))
train_model = k.models.Model(inputs=[p_samples, q_samples],
                             outputs=[ldre_net(p_samples),ldre_net(q_samples)])
Here is my attempt to convert the above code to Pytorch code:
def l2_penalty(model, l2_lambda=0.001):
    """Returns the L2 penalty of the params."""
    l2_norm = sum(p.pow(2).sum() for p in model.parameters())
    return l2_lambda*l2_norm
def build_dense_network(input_dim, output_dim,
                        output_activation, params, with_output_layer=True):
    activation = params.get(NetworkKeys.ACTIVATION, "relu")
    l2_reg_fact = params.get(NetworkKeys.L2_REG_FACT, 0.0)
    drop_prob = params.get(NetworkKeys.DROP_PROB, 0.0)
    batch_norm = params.get(NetworkKeys.BATCH_NORM, False)
    layers=[]
    last_dim = input_dim
    for i in range(len(params[NetworkKeys.NUM_UNITS])):
        layers.append(nn.Linear(last_dim,params[NetworkKeys.NUM_UNITS][i]))
        if batch_norm:
            layers.append(torch.nn.BatchNorm1d(params[NetworkKeys.NUM_UNITS][i]))
        if activation=="relu":
            layers.append(nn.ReLU())
        elif activation=="LeakyRelu":
            layers.append(nn.LeakyReLU(0.1,inplace=True))
        else:
            pass
        last_dim = params[NetworkKeys.NUM_UNITS][i]
        if drop_prob > 0.0:
            layers.append(torch.nn.Dropout(p=drop_prob))
    if with_output_layer:
        layers.append(nn.Linear(params[NetworkKeys.NUM_UNITS][-1],output_dim))
    model = nn.Sequential(*layers)
    regularizer = l2_penalty(model, l2_lambda=0.001) if l2_reg_fact > 0 else None
    return model, regularizer
class Split(torch.nn.Module):
    def __init__(self, module, n_parts: int, dim=1):
        super().__init__()
        self._n_parts = n_parts
        self._dim = dim
        self._module = module
    def forward(self, inputs):
        output = self._module(inputs)
        chunk_size = output.shape[self._dim] // self._n_parts
        return torch.split(output, chunk_size, dim=self._dim)
class Net(nn.Module):
    def __init__(self, hidden_params, input_dim):
        self._ldre_net, ldre_regularizer = build_dense_network(input_dim=input_dim,
            output_dim=1,output_activation="linear", params=hidden_params)
        self._p_samples = nn.Linear(input_dim,input_dim)
        self._q_samples = nn.Linear(input_dim,input_dim)
        self._split_layers = Split(
            self._ldre_net,
            n_parts=2,
            dim = 0
        )
    def forward(self, x, inTrain=True):
        if inTrain:
            p = self._p_samples(x)
            q = self._q_samples(x)
            p = x[:, 0, :]
            q = x[:, 1, :]
            combined = torch.cat((p.view(p.size(0), -1),
                                  q.view(q.size(0), -1)), dim=0)
            p_output, q_output =self._split_layers(combined)
            return p_output, q_output
        else:
            return self._ldre_net(x)
I am wondering whether my implementation in the Net class is correct or not?
TL;DR: In PyTorch you control the number of inputs and outputs yourself, either as a single tensor or as several variables. The missing super().__init__() call and the order of operations in forward() should be fixed. I also don't particularly like the way arguments are passed and would recommend using *args and **kwargs.
Explanation
I had to change a few things to make it run. The NetworkKeys constants are used as keys into the dictionary that is passed in. This seems like an overly complicated way to do things: you tried to provide default values, but the code still throws an exception when a key is missing (namely num_units). I recommend just using args and kwargs and passing the dictionary as a parameter. I tried it with the following example:
values = {NetworkKeys.BATCH_NORM: False,
          NetworkKeys.L2_REG_FACT: 0.0,
          NetworkKeys.DROP_PROB: 0.0,
          NetworkKeys.ACTIVATION: "relu",
          NetworkKeys.NUM_UNITS: [10, 10]
          }
print(values)
Net(values, 10)
There were a few things to fix in the Net class:
It needs the super initialization (e.g. super(Net, self).__init__()).
The order of the forward pass didn't make sense: you were overwriting the output of the linear layers. Note that we now call self._p_samples(p), where p = x[:, 0, :] is one slice of the input.
class Net(nn.Module):
    def __init__(self, hidden_params, input_dim):
        super(Net, self).__init__()
        self._ldre_net, ldre_regularizer = build_dense_network(input_dim=input_dim,
            output_dim=1,output_activation="linear", params=hidden_params)
        self._p_samples = nn.Linear(input_dim,input_dim)
        self._q_samples = nn.Linear(input_dim,input_dim)
        self._split_layers = Split(
            self._ldre_net,
            n_parts=2,
            dim = 0
        )
    def forward(self, x, inTrain=True):
        if inTrain:
            p = x[:, 0, :]
            q = x[:, 1, :]
            p = self._p_samples(p)
            q = self._q_samples(q)
            combined = torch.cat((p.view(p.size(0), -1),
                                  q.view(q.size(0), -1)), dim=0)
            p_output, q_output =self._split_layers(combined)
            return p_output, q_output
        else:
            return self._ldre_net(x)
Displaying the network after a successful forward pass with an input of torch.randn((1,2,10)) gives:
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
Net                                      --                        --
├─Linear: 1-1                            [1, 10]                   110
├─Linear: 1-2                            [1, 10]                   110
├─Split: 1-3                             [1, 1]                    --
├─Sequential: 1-4                        [2, 1]                    --
│    └─Linear: 2-1                       [2, 10]                   110
│    └─ReLU: 2-2                         [2, 10]                   --
│    └─Linear: 2-3                       [2, 10]                   110
│    └─ReLU: 2-4                         [2, 10]                   --
│    └─Linear: 2-5                       [2, 1]                    11
==========================================================================================
Total params: 451
Trainable params: 451
Non-trainable params: 0
Total mult-adds (M): 0.00
==========================================================================================
Example output will be of the form:
(tensor([[-0.0699]], grad_fn=<SplitBackward0>),
tensor([[0.0394]], grad_fn=<SplitBackward0>))
Note: I didn't try to overfit this model (which you should do) to validate that it indeed can learn what you want.
As a side note, if you want additional auxiliary outputs that are computed separately rather than being part of one tensor, you can simply return x, y from the forward pass.
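The concatenate-then-split pattern used by Split can be illustrated without PyTorch. In this sketch, shared_net is a hypothetical stand-in for the shared dense network and split mimics torch.split along dim 0. Both names are made up for illustration.

```python
def shared_net(rows):
    # Hypothetical stand-in for the shared network: one output value per input row
    return [[sum(row)] for row in rows]

def split(rows, chunk_size):
    # Like torch.split along dim 0: consecutive chunks of chunk_size rows
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]

p = [[1.0, 2.0], [3.0, 4.0]]  # batch of 2 samples from p
q = [[5.0, 6.0], [7.0, 8.0]]  # batch of 2 samples from q

combined = p + q                                 # like torch.cat((p, q), dim=0)
output = shared_net(combined)                    # one pass through the shared weights
p_out, q_out = split(output, len(output) // 2)   # n_parts = 2

print(p_out)  # [[3.0], [7.0]]
print(q_out)  # [[11.0], [15.0]]
```

For a network that treats each row independently, this gives the same result as calling the shared network once per input, which is what the Keras version does with ldre_net(p_samples) and ldre_net(q_samples). With batch normalization in training mode the two are not identical, because the batch statistics are then computed over the concatenated batch.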

Error while creating Q Network for tf agent in a 2D-grid environment

I am trying to create a custom PyEnvironment for a 2D grid (5*6 dimensions) so that the agent learns the optimal path from the top-left corner to the bottom-right corner.
I am facing this error:
ValueError: Expected q_network to emit a floating point tensor with inner dims (4,); but saw network output spec: TensorSpec(shape=(5, 4), dtype=tf.float32, name=None)
Below is the code of Environment class:
class MyEnvironment(py_environment.PyEnvironment):
    def __init__(self):
        super().__init__()
        self._action_spec = array_spec.BoundedArraySpec(
            shape=(), dtype=np.int32, name="action", minimum=0, maximum=3)
        self._state_spec = array_spec.BoundedArraySpec(
            shape=(5, 6), dtype=np.int32, name="observation", minimum=0, maximum=1)
        self.discount = 0.99
    def action_spec(self):
        return self._action_spec
    def state_spec(self):
        return self._state_spec
    def observation_spec(self):
        return self._state_spec
    def _reset(self):
        self._state = np.zeros(2, dtype=np.int32)
        obs = np.zeros((5, 6), dtype=np.int32)
        obs[self._state[0], self._state[1]] = 1
        return ts.restart(obs)
    def _step(self, action):
        self._state += [(-1, 0), (+1, 0), (0, -1), (0, +1)][action]
        reward = 0
        done = False
        obs = np.zeros((5, 6), dtype=np.int32)
        if(self._state[0]<0 or self._state[0]>4 or self._state[1]>5 or self._state[1]<0):
            done = True
        if not done:
            obs[self._state[0], self._state[1]] = 1
        if done or np.all(self._state == np.array([4,5])):
            reward = -1 if done else +10
            return ts.termination(obs, reward)
        else:
            return ts.transition(obs, reward, self.discount)
Q-network:
action_tensor_spec = tensor_spec.from_spec(env.action_spec())
num_actions = 4
# Define a helper function to create Dense layers configured with the right
# activation and kernel initializer.
def dense_layer(num_units):
    return tf.keras.layers.Dense(
        num_units,
        activation=tf.keras.activations.relu,
        kernel_initializer=tf.keras.initializers.VarianceScaling(
            scale=2.0, mode='fan_in', distribution='truncated_normal'))
# QNetwork consists of a sequence of Dense layers followed by a dense layer
# with `num_actions` units to generate one q_value per available action as
# its output.
dense_layers = [dense_layer(num_units) for num_units in fc_layer_params]
q_values_layer = tf.keras.layers.Dense(
    num_actions,
    activation=None,
    kernel_initializer=tf.keras.initializers.RandomUniform(
        minval=-0.03, maxval=0.03),
    bias_initializer=tf.keras.initializers.Constant(-0.2))
q_net = sequential.Sequential(dense_layers + [q_values_layer])
**Error after running this:**
#ValueError: Expected q_network to emit a floating point tensor with inner dims (4,); but saw network output spec: TensorSpec(shape=(5, 4), dtype=tf.float32, name=None)
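One plausible reading of the error, offered as an untested assumption: a Keras Dense layer only transforms the last axis of its input, so a (5, 6) observation passes through the stack as (5, units) and comes out as (5, 4) instead of (4,). The shape bookkeeping below sketches this. The value of fc_layer_params is assumed, since the question does not show it, and the helper names are made up for illustration.

```python
def dense_out_shape(in_shape, units):
    # A Dense layer replaces only the last axis and keeps the leading ones
    return in_shape[:-1] + (units,)

def flatten_shape(in_shape):
    # Flatten one (unbatched) observation into a single vector
    n = 1
    for d in in_shape:
        n *= d
    return (n,)

fc_layer_params = (100, 50)  # assumed; not given in the question
num_actions = 4
obs_shape = (5, 6)           # observation spec, batch dimension omitted

def q_net_out(shape):
    for units in fc_layer_params + (num_actions,):
        shape = dense_out_shape(shape, units)
    return shape

print(q_net_out(obs_shape))                 # (5, 4)
print(q_net_out(flatten_shape(obs_shape)))  # (4,)
```

If this reading is right, a possible fix is to flatten the observation before the first Dense layer, for example by putting tf.keras.layers.Flatten() at the front of the list passed to sequential.Sequential. This has not been run against the code above.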

Gradio - Pytorch MNIST Digit Recognizer

I watched the following video on YouTube, https://www.youtube.com/watch?v=jx9iyQZhSwI, which shows that Gradio can be used with a model trained on the MNIST dataset in TensorFlow. I have read that it is also possible to use PyTorch with Gradio, but I have problems implementing it. Does anyone have an idea how to do this?
My PyTorch CNN code:
import torch.nn as nn
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(
                in_channels=1,
                out_channels=16,
                kernel_size=5,
                stride=1,
                padding=2,
            ),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
        )
        self.conv2 = nn.Sequential(
            nn.Conv2d(16, 32, 5, 1, 2),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # fully connected layer, output 10 classes
        self.out = nn.Linear(32 * 7 * 7, 10)
    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        # flatten the output of conv2 to (batch_size, 32 * 7 * 7)
        x = x.view(x.size(0), -1)
        output = self.out(x)
        return output, x # return x for visualization
From the video, I gather that I need to change the function that Gradio uses:
def predict_image(img):
    img_3d=img.reshape(-1,28,28)
    im_resize=img_3d/255.0
    prediction=CNN(im_resize)
    pred=np.argmax(prediction)
    return pred
I'm sorry if I got your question wrong, but from what I understand you are getting an error when trying to predict the digit using your predict_image function.
Here are two possible hints. Maybe you have implemented them already, but I can't tell from such a small code snippet.
First, have you set your model to evaluation mode?
model.eval()
Here model stands for your trained CNN instance, not the CNN class itself. Do this after you have finished training and want to evaluate inputs without training the model.
Second, you may need to add a fourth dimension to your input tensor im_resize. Normally your model expects an input with dimensions for the batch size, the number of channels, the height and the width, in that order.
In addition, I cannot tell whether your input is of type torch.Tensor. If not, convert your array into a tensor first.
You can add a batch dimension to your input tensor by using
im_resize = im_resize.unsqueeze(0)
I hope that I understood your question correctly and was able to help you.

Luong Style Attention Mechanism with Dot and General scoring functions in keras and tensorflow

I am trying to implement the dot and general scoring functions in Keras. They compute similarity scores from the encoder outputs and the decoder hidden state.
My idea is to compute the dot score with tf.keras.layers.dot(encoder_output,decoder_state), but there is an error when multiplying these two values.
class Attention(tf.keras.Model):
    def __init__(self,units):
        super().__init__()
        self.units = units
    def call(self, decoder_state, encoder_output):
        score = tf.keras.layers.dot([encoder_output,decoder_state], axes=[2, 1])
        attention_weights = tf.nn.softmax(score, axis=1)
        context_vector = attention_weights * encoder_output
        context_vector = tf.reduce_sum(context_vector, axis=1)
        return context_vector, attention_weights
batch_size = 16
units = 32
input_length = 20
decoder_state = tf.random.uniform(shape=[batch_size, units])
encoder_output = tf.random.uniform(shape=[batch_size, input_length, units])
attention = Attention(units)
context_vector, attention_weights = attention(decoder_state, encoder_output)
I am getting the following error:
/usr/local/lib/python3.6/dist-packages/six.py in raise_from(value, from_value)
InvalidArgumentError: Incompatible shapes: [16,20] vs. [16,20,32] [Op:Mul]
It is probably a very simple fix, but as I am new to this I cannot work out the exact method that needs to be called here.
I have tried reshaping encoder_output, but this still does not work.
Please help me fix this.
I am just posting @Ayush Srivastava's comment as a response so that the post gets an answer.
Basically, the error occurs because you are trying to multiply two tensors (namely attention_weights and encoder_output) with incompatible shapes, [16,20] vs. [16,20,32]. Reshaping decoder_state to (16, 32, 1) makes the score, and hence attention_weights, come out as (16, 20, 1), which broadcasts against encoder_output.
Here is the full answer:
class Attention(tf.keras.Model):
    def __init__(self,units):
        super().__init__()
        self.units = units
    def call(self, decoder_state, encoder_output):
        decoder_state = tf.keras.layers.Reshape((decoder_state.shape[1], 1))(decoder_state)
        score = tf.keras.layers.dot([encoder_output, decoder_state],[2, 1])
        attention_weights = tf.nn.softmax(score, axis=1)
        context_vector = attention_weights * encoder_output
        context_vector = tf.reduce_sum(context_vector, axis=1)
        return context_vector, attention_weights
Shapes:
decoder_state before reshape: (16, 32)
decoder_state after reshape: (16, 32, 1)
enc_output: (16, 20, 32)
score: (16, 20, 1)
attention_weights: (16, 20, 1)
context_vector before sum: (16, 20, 32)
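To make the arithmetic behind these shapes concrete, here is a framework-free sketch of dot-score attention for a single example (no batch dimension). The function name dot_attention and the toy numbers are made up for illustration, and only the dot score is shown, not the general score.

```python
import math

def dot_attention(decoder_state, encoder_output):
    """decoder_state: `units` floats; encoder_output: `input_length` rows of `units` floats."""
    # score[t] = encoder_output[t] . decoder_state
    scores = [sum(h * s for h, s in zip(h_t, decoder_state)) for h_t in encoder_output]
    # softmax over the time steps (subtract the max for numerical stability)
    m = max(scores)
    exps = [math.exp(x - m) for x in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # context[j] = sum over t of weights[t] * encoder_output[t][j]
    units = len(decoder_state)
    context = [sum(w * h_t[j] for w, h_t in zip(weights, encoder_output))
               for j in range(units)]
    return context, weights

decoder_state = [1.0, 0.0]                              # units = 2
encoder_output = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # input_length = 3
context, weights = dot_attention(decoder_state, encoder_output)

print(len(weights), len(context))  # 3 2
```

There is one weight per encoder time step and the context vector has `units` entries. In the batched Keras version this corresponds to attention_weights of shape (16, 20, 1) and a context vector with 32 entries per example after the sum over axis 1.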