Aggregation by MLP for GIN and GCN: What is the difference? - deep-learning

I saw the following procedure for GIN in this link,
where the code for a GIN layer is written like this:
self.conv1 = GINConv(Sequential(Linear(num_node_features, dim_h),
                                BatchNorm1d(dim_h), ReLU(),
                                Linear(dim_h, dim_h), ReLU()))
Is the part inside the Sequential(...) an aggregation function or a pooling function?
Sequential(Linear(num_node_features, dim_h),
           BatchNorm1d(dim_h), ReLU(),
           Linear(dim_h, dim_h), ReLU())
Can I do the same thing for a GCN layer?
self.conv1 = GCNConv(Sequential(Linear(num_node_features, dim_h),
                                BatchNorm1d(dim_h), ReLU(),
                                Linear(dim_h, dim_h), ReLU()))
self.conv2 = GCNConv(Sequential(Linear(dim_h, dim_h),
                                BatchNorm1d(dim_h), ReLU(),
                                Linear(dim_h, dim_h), ReLU()))
I get the following error:
---> 15 self.conv1 = GCNConv(Sequential(Linear(num_node_features,dim_h),
16 BatchNorm1d(dim_h),ReLU(),
17 Linear(dim_h,dim_h),ReLU()))
18 self.conv2 = GCNConv(Sequential(Linear(dim_h,dim_h),
19 BatchNorm1d(dim_h),ReLU(),
20 Linear(dim_h,dim_h),ReLU()))
21 self.conv3 = GCNConv(Sequential(Linear(dim_h,dim_h),
22 BatchNorm1d(dim_h),ReLU(),
23 Linear(dim_h,dim_h),ReLU()))
TypeError: GCNConv.__init__() missing 1 required positional argument: 'out_channels'

You can look at the GINConv and GCNConv APIs in the torch_geometric documentation.
GINConv()
GINConv has an nn argument, e.g. a network defined with torch.nn.Sequential. That is why the tutorial you mentioned above can pass a Sequential() module to it.
GCNConv()
GCNConv(), however, does not have an nn argument; it only takes in_channels and out_channels, which is why you get the "missing 1 required positional argument: 'out_channels'" error.
When you're unsure about a method, looking it up in the API documentation is a good way to resolve these issues :)
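For illustration, here is a minimal sketch (with hypothetical sizes, mirroring the tutorial's MLP) of how the two layer types are typically constructed:
from torch.nn import Sequential, Linear, BatchNorm1d, ReLU
from torch_geometric.nn import GINConv, GCNConv

num_node_features, dim_h = 16, 32  # hypothetical sizes

# GINConv: neighbor features are sum-aggregated, then transformed by the
# MLP passed via the `nn` argument.
gin_layer = GINConv(Sequential(Linear(num_node_features, dim_h),
                               BatchNorm1d(dim_h), ReLU(),
                               Linear(dim_h, dim_h), ReLU()))

# GCNConv: no `nn` argument; it combines normalized-adjacency aggregation
# with a single learned linear transform, so it only takes channel sizes.
gcn_layer = GCNConv(num_node_features, dim_h)
If you want an MLP on top of a GCNConv, one option is to apply it as a separate module in forward() after the convolution, rather than passing it to the layer.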

Related

ValueError: Layer count mismatch when loading weights from file. Model expected 241 layers, found

I'm trying to create the base pre-trained model. I used the following code:
base_model = DenseNet121(weights='/Users/awabe/Desktop/Project/PapilaDB/ClinicalData/densenet121_weights_tf_dim_ordering_tf_kernels.h5', include_top=False)
x = base_model.output
# add a global spatial average pooling layer
x = GlobalAveragePooling2D()(x)
# and a logistic layer
predictions = Dense(len(labels), activation="sigmoid")(x)
model = Model(inputs=base_model.input, outputs=predictions)
model.compile(optimizer='adam', loss=get_weighted_loss(pos_weights, neg_weights))
It gives me an error with the message:
ValueError Traceback (most recent call last)
Cell In[73], line 2
1 # create the base pre-trained model
----> 2 base_model = DenseNet121(weights='/Users/awabe/Desktop/Project/PapilaDB/ClinicalData/densenet121_weights_tf_dim_ordering_tf_kernels.h5', include_top=False)
4 x = base_model.output
6 # add a global spatial average pooling layer
File /opt/anaconda3/envs/tensorflow/lib/python3.10/site-packages/keras/applications/densenet.py:358, in DenseNet121(include_top, weights, input_tensor, input_shape, pooling, classes, classifier_activation)
345 @keras_export(
346 "keras.applications.densenet.DenseNet121", "keras.applications.DenseNet121"
347 )
(...)
355 classifier_activation="softmax",
356 ):
357 """Instantiates the Densenet121 architecture."""
--> 358 return DenseNet(
359 [6, 12, 24, 16],
360 include_top,
361 weights,
362 input_tensor,
363 input_shape,
364 pooling,
365 classes,
366 classifier_activation,
367 )
File /opt/anaconda3/envs/tensorflow/lib/python3.10/site-packages/keras/applications/densenet.py:340, in DenseNet(blocks, include_top, weights, input_tensor, input_shape, pooling, classes, classifier_activation)
338 model.load_weights(weights_path)
339 elif weights is not None:
--> 340 model.load_weights(weights)
342 return model
File /opt/anaconda3/envs/tensorflow/lib/python3.10/site-packages/keras/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
67 filtered_tb = _process_traceback_frames(e.traceback)
68 # To get the full stack trace, call:
69 # tf.debugging.disable_traceback_filtering()
---> 70 raise e.with_traceback(filtered_tb) from None
71 finally:
72 del filtered_tb
File /opt/anaconda3/envs/tensorflow/lib/python3.10/site-packages/keras/saving/hdf5_format.py:817, in load_weights_from_hdf5_group(f, model)
815 layer_names = filtered_layer_names
816 if len(layer_names) != len(filtered_layers):
--> 817 raise ValueError(
818 f"Layer count mismatch when loading weights from file. "
819 f"Model expected {len(filtered_layers)} layers, found "
820 f"{len(layer_names)} saved layers."
821 )
823 # We batch weight value assignments in a single backend call
824 # which provides a speedup in TensorFlow.
825 weight_value_tuples = []
ValueError: Layer count mismatch when loading weights from file. Model expected 241 layers, found 242 saved layers.

PyTorch NotImplementedError in forward

I'm writing code that classifies digits using PyTorch.
epochess = []
train_losses = []
test_losses = []
acc_training = []
acc_testing = []
for epoch in range(epochs):
    train_acc, train_epoch_loss = train_CNN(model, loss_function, optimizer, train_load, device)
    print('epoch', epoch, 'training loss', train_epoch_loss)
    train_losses.append(train_epoch_loss)
    print('epoch', epoch, 'training accuracy', train_acc)
    acc_training.append(train_acc)
    test_acc, test_epoch_loss = validate_CNN(model, loss_function, test_load, device)
    print('epoch', epoch, 'testing loss', test_epoch_loss)
    test_losses.append(test_epoch_loss)
    print('epoch', epoch, 'testing accuracy', test_acc)
    acc_testing.append(test_acc)
    epochess.append(epoch)
and I get an error, even though I was following along exactly as shown on YouTube.
Here is the error:
---------------------------------------------------------------------------
NotImplementedError Traceback (most recent call last)
<ipython-input-17-0bcb51ebbc3d> in <module>
5 acc_testing = []
6 for epoch in range (epochs):
----> 7 train_acc, train_epoch_loss = train_CNN(model,loss_function, optimizer, train_load, device)
8 print('epoch',epoch ,'training loss',train_epoch_loss)
9 train_losses.append(train_epoch_loss)
2 frames
/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py in _forward_unimplemented(self, *input)
242 registered hooks while the latter silently ignores them.
243 """
--> 244 raise NotImplementedError(f"Module [{type(self).__name__}] is missing the required \"forward\" function")
245
246
NotImplementedError: Module [CNN] is missing the required "forward" function
How did you implement your model?
Did you use a built-in model from PyTorch or did you create a custom model?
If you created a custom model, make sure you use the right components from PyTorch (e.g. torch.nn.Linear and torch.nn.Conv2d, see https://pytorch.org/tutorials/beginner/introyt/modelsyt_tutorial.html); otherwise PyTorch might complain about certain functions missing, as is happening in your case. In particular, a custom nn.Module subclass must define a forward method, which is exactly what the error says is missing from your CNN class.
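As a minimal sketch (not your actual model, which isn't shown here), a custom module with a properly defined forward looks like this:
import torch
import torch.nn as nn

class CNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv = nn.Conv2d(1, 16, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2)
        self.fc = nn.Linear(16 * 14 * 14, num_classes)

    # Without this method, calling model(x) raises exactly the
    # NotImplementedError shown in your traceback.
    def forward(self, x):
        x = self.pool(torch.relu(self.conv(x)))
        x = torch.flatten(x, 1)
        return self.fc(x)

model = CNN()
out = model(torch.randn(8, 1, 28, 28))  # shape (8, num_classes)
A common cause of this error is that forward is misspelled or indented outside the class body, so PyTorch falls back to the unimplemented default.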

RuntimeError: one of the variables needed for gradient computation has been modified: is at version 2; expected version 1 instead

I'm trying the following Kaggle.
TL;DR: I want to classify a sequence (time-series) of measurements to 1 of K classes using LSTM.
I'm trying to overfit the model on 2 sequences:
My input is (B, N, M):
B : batch-size = 1
N : sequence-size = 128
M : num-of-feature = 14 (number of measurements in each timestamp)
My model is a very simple LSTM:
class LSTMClassifier(nn.Module):
    def __init__(self, in_dim, hidden_dim, out_dim, num_layers):
        super(LSTMClassifier, self).__init__()
        self.in_dim = in_dim
        self.hidden_dim = hidden_dim
        self.out_dim = out_dim
        self.num_layers = num_layers
        self.lstm = nn.LSTM(in_dim, hidden_dim, num_layers=num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, out_dim)

    def forward(self, x):
        lstm_out, (ht, ct) = self.lstm(x)
        y = self.fc(ht[-1].reshape(-1, self.hidden_dim))
        return y
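As a quick shape check of the setup described above (a sketch with a hypothetical number of classes K = 5 and a single LSTM layer):
import torch

model = LSTMClassifier(in_dim=14, hidden_dim=100, out_dim=5, num_layers=1)
x = torch.randn(1, 128, 14)   # (B, N, M) as described above
logits = model(x)
print(logits.shape)           # torch.Size([1, 5]) -- one score per class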
And the train process is:
def train_lstm_model(model, data_loader, num_epochs, loss_cls, optimizer_cls, learning_rate):
    start = time.time()
    loss = loss_cls()
    optimizer = optimizer_cls(model.parameters(), lr=learning_rate)
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model.to(device)
    for epoch in tqdm(range(num_epochs)):
        hidden = (torch.zeros((1, data_loader.batch_size, model.hidden_dim), requires_grad=True).to(device),
                  torch.zeros(1, data_loader.batch_size, model.hidden_dim, requires_grad=True).to(device))
        for i, (batch_x, batch_y) in enumerate(data_loader):
            batch_x = batch_x.to(device).float()
            batch_y = batch_y.to(device).long()
            optimizer.zero_grad()
            y_predicted, hidden = model(batch_x, hidden)
            l = loss(y_predicted, batch_y)
            l.backward()
            optimizer.step()
            # print(f'epoch {epoch+1}, batch {i+1}: loss = {l.item()} |',
            #       f'train accuracy: {eval_lstm_model(model, data_loader.dataset, hidden)}')
    end = time.time()
    print(f'Training took {end-start} seconds.')
And my setup code is:
loss_cls = nn.CrossEntropyLoss
optimizer_cls = torch.optim.SGD
hidden_dim = 100
model_lstm = LSTMClassifier(X_of.shape[-1], hidden_dim, len(np.unique(y_train)))
learning_rate = 0.01
num_epochs = 1000
train_lstm_model(model_lstm, overfit_loader, num_epochs, loss_cls, optimizer_cls, learning_rate)
The overfit_loader is a DataLoader which contains only 2 samples.
But the training process outputs the following error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-87-5f725d0ecc50> in <module>
27 learning_rate = 0.001
28 num_epochs = 100
---> 29 train_lstm_model(model_lstm, overfit_loader, num_epochs, loss_cls, optimizer_cls, learning_rate)
<ipython-input-86-ba60b3627f13> in train_lstm_model(model, data_loader, num_epochs, loss_cls, optimizer_cls, learning_rate, test_loader)
20 l = loss(y_predicted, batch_y)
21
---> 22 l.backward(retain_graph=True)
23 optimizer.step()
24
/usr/local/lib64/python3.6/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
219 retain_graph=retain_graph,
220 create_graph=create_graph)
--> 221 torch.autograd.backward(self, gradient, retain_graph, create_graph)
222
223 def register_hook(self, hook):
/usr/local/lib64/python3.6/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
130 Variable._execution_engine.run_backward(
131 tensors, grad_tensors_, retain_graph, create_graph,
--> 132 allow_unreachable=True) # allow_unreachable flag
133
134
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace
operation: [torch.cuda.FloatTensor [400, 14]] is at version 2; expected version 1 instead. Hint: the
backtrace further above shows the operation that failed to compute its gradient. The variable in question
was changed in there or anywhere later. Good luck!
EDIT: I've removed the loss printing and stopped re-using the hidden state, following @SzymonMaszke's comment, and the exception is gone, but there's still a problem: the loss doesn't converge below 0.7.
I'd like to get some help please,
Thanks!
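For reference, a minimal sketch of the change described in the EDIT (re-initializing the hidden state inside the batch loop instead of carrying it across iterations; this assumes, as in the training loop above, that the model's forward accepts and returns the hidden tuple):
for epoch in tqdm(range(num_epochs)):
    for i, (batch_x, batch_y) in enumerate(data_loader):
        batch_x = batch_x.to(device).float()
        batch_y = batch_y.to(device).long()

        # Fresh hidden state for every batch, so backward() never depends on
        # tensors from a graph that optimizer.step() has already modified.
        hidden = (torch.zeros(1, batch_x.size(0), model.hidden_dim, device=device),
                  torch.zeros(1, batch_x.size(0), model.hidden_dim, device=device))

        optimizer.zero_grad()
        y_predicted, hidden = model(batch_x, hidden)
        l = loss(y_predicted, batch_y)
        l.backward()
        optimizer.step()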

PyTorch AttributeError: 'UNet3D' object has no attribute 'size'

I am working on an image segmentation transfer learning project using PyTorch. I am using the weights of this pre-trained model and its UNet3D class:
https://github.com/MrGiovanni/ModelsGenesis
When I run the following code I get this error at the line where MSELoss is called: "AttributeError: 'DataParallel' object has no attribute 'size'". When I delete the first line I get a similar error: "AttributeError: 'UNet3D' object has no attribute 'size'".
How can I convert the DataParallel or UNet3D object into something MSELoss can use? I do not need DataParallel for now; I need to run the UNet3D() class for transfer learning.
model = nn.DataParallel(model, device_ids=[i for i in range(torch.cuda.device_count())])
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), conf.lr, momentum=0.9, weight_decay=0.0, nesterov=False)
scheduler = lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)
initial_epoch = 10
for epoch in range(initial_epoch, conf.nb_epoch):
    scheduler.step(epoch)
    model.train()
    for batch_ndx, (x, y) in enumerate(train_loader):
        x, y = x.float().to(device), y.float().to(device)
        pred = model
        loss = criterion(pred, y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-46-20d1943b3498> in <module>
25 x, y = x.float().to(device), y.float().to(device)
26 pred = model
---> 27 loss = criterion(pred, y)
28 optimizer.zero_grad()
29 loss.backward()
/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/loss.py in forward(self, input, target)
430
431 def forward(self, input, target):
--> 432 return F.mse_loss(input, target, reduction=self.reduction)
433
434
/opt/anaconda3/lib/python3.7/site-packages/torch/nn/functional.py in mse_loss(input, target, size_average, reduce, reduction)
2528 mse_loss, tens_ops, input, target, size_average=size_average, reduce=reduce,
2529 reduction=reduction)
-> 2530 if not (target.size() == input.size()):
2531 warnings.warn("Using a target size ({}) that is different to the input size ({}). "
2532 "This will likely lead to incorrect results due to broadcasting. "
/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py in __getattr__(self, name)
592 return modules[name]
593 raise AttributeError("'{}' object has no attribute '{}'".format(
--> 594 type(self).__name__, name))
595
596 def __setattr__(self, name, value):
AttributeError: 'UNet3D' object has no attribute 'size'
You have a typo on this line:
pred = model
should be
pred = model(x)
model is an nn.Module object that describes the network; x, y, and pred are (supposed to be) torch tensors.
Aside from this particular case, I think it would be good to think about how to solve this type of problem in general.
You saw an error (exception) on a certain line. Is the problem there, or earlier? Can you isolate the problem?
For example, if you print out the arguments you're passing to criterion(pred, y) just before the call, do they look right? (they don't)
What happens if you create a couple of tensors of the right shape just before the call and pass them instead? (works fine)
What is the error really saying? "AttributeError: 'UNet3D' object has no attribute 'size'" - well, of course it's not supposed to have a size, but why is the code trying to access its size? Actually, why is the code even able to access that object on that line? (since the model is not supposed to be passed to the criterion function, right?)
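For example, a quick sanity check along those lines might look like this (a sketch; the shapes are hypothetical):
import torch

# Dummy tensors of matching shape work fine, which tells you the problem
# lies in what gets passed to the criterion, not in MSELoss itself.
dummy_pred = torch.randn(2, 1, 64, 64, 64)
dummy_target = torch.randn(2, 1, 64, 64, 64)
print(criterion(dummy_pred, dummy_target))

# The actual fix: run the batch through the model instead of passing the model.
pred = model(x)
loss = criterion(pred, y)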
Maybe useful further reading: https://ericlippert.com/2014/03/05/how-to-debug-small-programs/

how to fix "OperatorNotAllowedInGraphError " error in Tensorflow 2.0

I'm learning TensorFlow 2.0 from the official tutorials. I can understand the result of the code below.
def square_if_positive(x):
    return [i ** 2 if i > 0 else i for i in x]

square_if_positive(range(-5, 5))
# result
[-5, -4, -3, -2, -1, 0, 1, 4, 9, 16]
But if I change the input to a tensor instead of a Python range, like this:
def square_if_positive(x):
    return [i ** 2 if i > 0 else i for i in x]

square_if_positive(tf.range(-5, 5))
I get the error below:
OperatorNotAllowedInGraphError Traceback (most recent call last)
<ipython-input-39-6c17f29a3443> in <module>
2 def square_if_positive(x):
3 return [i**2 if i > 0 else i for i in x]
----> 4 square_if_positive(tf.range(10))
5 # measure_graph_size(square_if_positive, range(10))
~/tf2_workspace/tf2.0/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py in __call__(self, *args, **kwds)
437 # This is the first call of __call__, so we have to initialize.
438 initializer_map = {}
--> 439 self._initialize(args, kwds, add_initializers_to=initializer_map)
440 if self._created_variables:
441 try:
~/tf2_workspace/tf2.0/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py in _initialize(self, args, kwds, add_initializers_to)
380 self._concrete_stateful_fn = (
381 self._stateful_fn._get_concrete_function_internal_garbage_collected( # pylint: disable=protected-access
--> 382 *args, **kwds))
383
384 def invalid_creator_scope(*unused_args, **unused_kwds):
~/tf2_workspace/tf2.0/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py in _get_concrete_function_internal_garbage_collected(self, *args, **kwargs)
1793 if self.input_signature:
1794 args, kwargs = None, None
-> 1795 graph_function, _, _ = self._maybe_define_function(args, kwargs)
1796 return graph_function
1797
~/tf2_workspace/tf2.0/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py in _maybe_define_function(self, args, kwargs)
2093 graph_function = self._function_cache.primary.get(cache_key, None)
2094 if graph_function is None:
-> 2095 graph_function = self._create_graph_function(args, kwargs)
2096 self._function_cache.primary[cache_key] = graph_function
2097 return graph_function, args, kwargs
~/tf2_workspace/tf2.0/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py in _create_graph_function(self, args, kwargs, override_flat_arg_shapes)
1984 arg_names=arg_names,
1985 override_flat_arg_shapes=override_flat_arg_shapes,
-> 1986 capture_by_value=self._capture_by_value),
1987 self._function_attributes,
1988 # Tell the ConcreteFunction to clean up its graph once it goes out of
~/tf2_workspace/tf2.0/lib/python3.6/site-packages/tensorflow_core/python/framework/func_graph.py in func_graph_from_py_func(name, python_func, args, kwargs, signature, func_graph, autograph, autograph_options, add_control_dependencies, arg_names, op_return_value, collections, capture_by_value, override_flat_arg_shapes)
851 converted_func)
852
--> 853 func_outputs = python_func(*func_args, **func_kwargs)
854
855 # invariant: `func_outputs` contains only Tensors, CompositeTensors,
~/tf2_workspace/tf2.0/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py in wrapped_fn(*args, **kwds)
323 # __wrapped__ allows AutoGraph to swap in a converted function. We give
324 # the function a weak reference to itself to avoid a reference cycle.
--> 325 return weak_wrapped_fn().__wrapped__(*args, **kwds)
326 weak_wrapped_fn = weakref.ref(wrapped_fn)
327
~/tf2_workspace/tf2.0/lib/python3.6/site-packages/tensorflow_core/python/framework/func_graph.py in wrapper(*args, **kwargs)
841 except Exception as e: # pylint:disable=broad-except
842 if hasattr(e, "ag_error_metadata"):
--> 843 raise e.ag_error_metadata.to_exception(type(e))
844 else:
845 raise
OperatorNotAllowedInGraphError: in converted code:
<ipython-input-37-6c17f29a3443>:3 square_if_positive *
return [i**2 if i > 0 else i for i in x]
/Users/zhangpan/tf2_workspace/tf2.0/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:547 __iter__
self._disallow_iteration()
/Users/zhangpan/tf2_workspace/tf2.0/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:540 _disallow_iteration
self._disallow_when_autograph_enabled("iterating over `tf.Tensor`")
/Users/zhangpan/tf2_workspace/tf2.0/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:518 _disallow_when_autograph_enabled
" decorating it directly with #tf.function.".format(task))
OperatorNotAllowedInGraphError: iterating over `tf.Tensor` is not allowed: AutoGraph did not convert this function. Try decorating it directly with @tf.function.
I can't find any specification of this error. I think the real reason is not "iterating over tf.Tensor is not allowed", because I can write this:
@tf.function
def square_if_positive(x):
    for i in x:
        if i > 0:
            tf.print(i ** 2)
        else:
            tf.print(i)

square_if_positive(tf.range(10))
In that code I iterate over a tensor without any problem. So my question is: what is the real reason for this error? Any suggestions would help me; I really can't understand this error even though I have read a lot of material.
The root cause is that AutoGraph doesn't yet support list comprehensions (primarily because it's difficult to determine the dtype of the result in all cases).
As a workaround, you can use tf.map_fn for the comprehension:
return tf.map_fn(lambda i: i ** 2 if i > 0 else i, x)
For more information please take a look at this issue.
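Another element-wise formulation that sidesteps both the comprehension and the Python conditional (not from the linked issue, just a common alternative) is tf.where:
import tensorflow as tf

@tf.function
def square_if_positive(x):
    # Element-wise select: square where x > 0, otherwise keep x unchanged.
    return tf.where(x > 0, x ** 2, x)

print(square_if_positive(tf.range(-5, 5)))  # [-5 -4 -3 -2 -1  0  1  4  9 16]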
In case it helps someone: I had the same problem with code that did
for index, image in enumerate(inputs):
    ... my code ...
The solution was just to do
index = 0
for image in inputs:
    ... my code ...
    index += 1
I had a similar issue when using tf.range() instead of Python's range() for a list comprehension inside a TensorFlow graph function. I was training a 3D segmentation neural net and had to use range() for the code to work.
Check the pseudocode below:
Y = ...          # [Batch, Height, Width, Depth, Channels]
y_predict = ...  # [B, H, W, D, C, MC_Runs]; MC_Runs = Monte Carlo runs

@tf.function
def train_loss(Y, y_predict):
    # calculate the loss and return a scalar value
    ...

@tf.function
def train_step():
    loss = [train_loss(Y, y_predict[:, :, :, :, :, id_]) for id_ in range(MC_RUNS)]
    loss = tf.math.reduce_mean(loss)