In Keras (deep learning library), RepeatVector + TimeDistributed = Error?

Hi, I'm a Keras beginner.
I'm building a model:
step 1. Input a batch of word index lists, (BATCH_SIZE, WORD_INDEX_LIST)
step 2. Get word embeddings for each word, (BATCH_SIZE, WORD_LENGTH, EMBEDDING_SIZE)
step 3. Average the word embeddings in each batch, (BATCH_SIZE, EMBEDDING_SIZE)
step 4. Repeat the vector N times, (BATCH_SIZE, N, EMBEDDING_SIZE)
step 5. Apply a Dense layer to each time step
So I wrote this code.
MAX_LEN = 20 ( = length of WORD_INDEX_LIST)
# step 1
layer_target_input = Input(shape=(MAX_LEN,), dtype="int32", name="layer_target_input")
# step 2
layer_embedding = Embedding(input_dim = n_symbols+1, output_dim=vector_dim,input_length=MAX_LEN,
name="embedding", weights= [embedding_weights],trainable = False)
encoded_target = layer_embedding(layer_target_input)
# step 3
encoded_target_agg = KL.core.Lambda( lambda x: K.sum(x, axis=1) )(encoded_target)
# step 4
encoded_target_agg_repeat = KL.RepeatVector( MAX_LEN)(encoded_target_agg)
# step 5
layer_annotated_tahn = KL.Dense(output_dim=50, name="layer_tahn")
layer_annotated_tahn_td = KL.TimeDistributed(layer_annotated_tahn) (encoded_target_agg_repeat)
model = KM.Model(input=[layer_target_input], output=[ layer_annotated_tahn_td])
r = model.predict({ "layer_target_input":dev_targ}) # dev_targ = (2, 20, 300)
But when I run this code, the result is below.
Traceback (most recent call last):
File "Main.py", line 127, in <module>
r = model.predict({ "layer_target_input":dev_targ})
File "/usr/local/anaconda/lib/python2.7/site-packages/Keras-1.0.7-py2.7.egg/keras/engine/training.py", line 1180, in predict
batch_size=batch_size, verbose=verbose)
File "/usr/local/anaconda/lib/python2.7/site-packages/Keras-1.0.7-py2.7.egg/keras/engine/training.py", line 888, in _predict_loop
outs[i][batch_start:batch_end] = batch_out
ValueError: could not broadcast input array from shape (30,20,50) into shape (2,20,50)
Why is the batch size changed?
What did I do wrong?

The problem is in the Lambda operator. In your case it takes a tensor of shape (batch_size, max_len, embedding_size) and is expected to produce a tensor of shape (batch_size, embedding_size). However, the Lambda op doesn't know what transformation you apply internally, so during graph compilation it mistakenly assumes that the shape doesn't change, and therefore reports the output shape as (batch_size, max_len, embedding_size).

The RepeatVector that follows expects its input to be two-dimensional, but never asserts that this is the case. It computes its expected output shape as (batch_size, num_repetitions, in_shape[1]). Since Lambda mistakenly reported its shape as (batch_size, max_len, embedding_size), RepeatVector now reports its shape as (batch_size, num_repetitions, max_len) instead of the expected (batch_size, num_repetitions, embedding_size). num_repetitions in your case is the same as max_len, so RepeatVector reports its shape as (batch_size, max_len, max_len).

The way TimeDistributed(Dense) works is:
Reshape((-1, input_shape[2]))
Dense()
Reshape((-1, input_shape[1], num_outputs))
By now input_shape[2] is mistakenly assumed to be max_len instead of embedding_size, but the actual tensor that is passed in has the correct shape of (batch_size, max_len, embedding_size), so what ends up happening is:
Reshape((batch_size * embedding_size, max_len))
Dense()
Reshape((batch_size * embedding_size / max_len, max_len, num_outputs))
In your case batch_size * embedding_size / max_len happens to be 2 * 300 / 20 = 30, which is where your wrong shape comes from.
To fix it, you need to explicitly tell Lambda the shape you want it to produce:
encoded_target_agg = KL.core.Lambda( lambda x: K.sum(x, axis=1), output_shape=(vector_dim,))(encoded_target)
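As a side note, step 3 in the question describes averaging rather than summing the embeddings; assuming vector_dim is the embedding size as above, a mean can be used the same way:
encoded_target_agg = KL.core.Lambda( lambda x: K.mean(x, axis=1), output_shape=(vector_dim,))(encoded_target)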

Related

Does a linear layer after a GRU preserve the sequence output order?

I'm dealing with the following scenario:
My input has the shape of: [batch_size, input_sequence_length, input_features]
where:
input_sequence_length = 10
input_features = 3
My output has the shape of: [batch_size, output_sequence_length]
where:
output_sequence_length = 5
i.e., for each time slot of 10 units (each with 3 features) I need to predict the next 5 slot values.
I built the following model:
import torch
import torch.nn as nn
import torchinfo
class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.GRU = nn.GRU(input_size=3, hidden_size=32, num_layers=2, batch_first=True)
        self.fc = nn.Linear(32, 5)

    def forward(self, input_series):
        output, h = self.GRU(input_series)
        output = output[:, -1, :]       # get last state
        output = self.fc(output)
        output = output.view(-1, 5, 1)  # reorganize output
        return output
torchinfo.summary(MyModel(), (512, 10, 3))
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
MyModel                                  [512, 5, 1]               --
├─GRU: 1-1                               [512, 10, 32]             9,888
├─Linear: 1-2                            [512, 5]                  165
==========================================================================================
I'm getting good results (very small MSE loss, and the predictions look good),
but I'm not sure whether the model's output (5 sequence values) is really ordered by the model,
i.e. the second output based on the first output, the third output based on the second output, and so on.
I know that the GRU output is based on the learned sequence history.
But I also used a linear layer, so is the output (after the linear layer) still ordered in time?
UPDATE
This answer isn't quite right, see this follow-up question. The best way is to write the math and show that the 5 scalar outputs aren't functions of each other.
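For instance, with the model above the linear layer computes y = W·h + b, where h is the last GRU hidden state, so each of the 5 outputs y_i = w_i·h + b_i is a function of h alone; no y_i takes another y_j as input, and any dependence among the 5 outputs comes only through the shared hidden state h.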
Old Answer
I'm not sure if the model output (5 sequence values) are really ordered by the model ? i.e the second output based on the first output and the third output based on the second output
No, they aren't. You can check that the gradients of, say, the last output w.r.t. the previous outputs are zeros, which basically means that the last output isn't a function of the previous outputs.
model = MyModel()
x = torch.rand([2, 10, 3])
y = model(x)
y.retain_grad() # allows accessing y.grad although y is a non-leaf Tensor
y[:, -1].sum().backward() # computes gradients of last output
assert torch.allclose(y.grad[:, :-1], torch.tensor(0.)) # gradients w.r.t previous outputs are zeroes
A popular model to capture dependencies among output labels is conditional random fields. But since you're already happy with the predictions of the current model, perhaps modelling the output dependencies isn't that important.

Gradio - Pytorch MNIST Digit Recognizer

I watched the following video on YouTube https://www.youtube.com/watch?v=jx9iyQZhSwI, where it was shown that it is possible to use Gradio with a model trained on the MNIST dataset in TensorFlow. I have read that it is also possible to use PyTorch in Gradio, but I have problems with its implementation. Does anyone have an idea how to do this?
My PyTorch code for the CNN:
import torch.nn as nn
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(
                in_channels=1,
                out_channels=16,
                kernel_size=5,
                stride=1,
                padding=2,
            ),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
        )
        self.conv2 = nn.Sequential(
            nn.Conv2d(16, 32, 5, 1, 2),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # fully connected layer, output 10 classes
        self.out = nn.Linear(32 * 7 * 7, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        # flatten the output of conv2 to (batch_size, 32 * 7 * 7)
        x = x.view(x.size(0), -1)
        output = self.out(x)
        return output, x  # return x for visualization
From the video I gathered that I need to change the function that Gradio uses:
def predict_image(img):
    img_3d = img.reshape(-1, 28, 28)
    im_resize = img_3d / 255.0
    prediction = CNN(im_resize)
    pred = np.argmax(prediction)
    return pred
I'm sorry if I got your question wrong, but from what I understand you are getting an error when trying to predict the digit using your function predict_image.
So here are two possible hints. Maybe you have implemented them already, but I can't tell from the very small code snippet.
First of all, have you set your model into evaluation mode using
CNN.eval()
Do this after you have finished training your model and want to evaluate inputs without training the model.
Second of all, maybe you need to add a fourth dimension to your input tensor im_resize. Normally your model expects a dimension for the batch size, the number of channels, the height and the width of your input.
In addition, I cannot tell whether your input is of the datatype torch.Tensor. If not, transform your array into a tensor first.
You can add a batch dimension to your input tensor by using
im_resize = im_resize.unsqueeze(0)
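Putting those hints together, a minimal sketch of what the prediction function could look like, assuming you have a trained instance of your CNN class, img arrives from Gradio as a 28x28 NumPy array, and the weights file name below is hypothetical:

import numpy as np
import torch

cnn = CNN()
# cnn.load_state_dict(torch.load("mnist_cnn.pt"))  # hypothetical path to your trained weights
cnn.eval()

def predict_image(img):
    # scale to [0, 1] and convert to a float tensor
    x = torch.tensor(img, dtype=torch.float32) / 255.0
    # add batch and channel dimensions: (1, 1, 28, 28)
    x = x.reshape(1, 1, 28, 28)
    with torch.no_grad():
        logits, _ = cnn(x)  # the model above returns (output, x)
    return int(torch.argmax(logits, dim=1).item())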
I hope that I understand your question correctly and was able to help you.

Why/how can model.forward() succeed both on input being mini-batch vs single item?

Why and how does this work?
When I run the forward phase on an input that is
either a mini-batch tensor
or a single input item,
model.__call__() (which AFAIK calls forward()) swallows it and spits out adequate output (i.e. a tensor of a mini-batch of estimates or a single estimate item).
Adapting test code from the PyTorch NN example shows what I mean, but I don't get it.
I would expect it to create problems and to force me to transform the single-item input into a mini-batch of size 1 (reshape(1, xxx)) or the like, as I did in the code below.
(I did variations of the test to be sure it is e.g. not depending on execution order.)
# -*- coding: utf-8 -*-
import torch
# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
#N, D_in, H, D_out = 64, 1000, 100, 10
N, D_in, H, D_out = 64, 10, 4, 3
# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)
# Use the nn package to define our model as a sequence of layers. nn.Sequential
# is a Module which contains other Modules, and applies them in sequence to
# produce its output. Each Linear Module computes output from input using a
# linear function, and holds internal Tensors for its weight and bias.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)
# The nn package also contains definitions of popular loss functions; in this
# case we will use Mean Squared Error (MSE) as our loss function.
loss_fn = torch.nn.MSELoss(reduction='sum')
learning_rate = 1e-4
for t in range(1):
    # Forward pass: compute predicted y by passing x to the model. Module objects
    # override the __call__ operator so you can call them like functions. When
    # doing so you pass a Tensor of input data to the Module and it produces
    # a Tensor of output data.
    model.eval()
    print("###########")
    print("x[0]", x[0])
    print("x[0].size()", x[0].size())
    y_1pred = model(x[0])
    print("y_1pred.size()", y_1pred.size())
    print(y_1pred)

    model.eval()
    print("###########")
    print("x.size()", x.size())
    y_pred = model(x)
    print("y_pred.size()", y_pred.size())
    print("y_pred[0]", y_pred[0])
    print("###########")

    model.eval()
    input_item = x[0]
    batch_len1_shape = torch.Size([1, *(input_item.size())])
    batch_len1 = input_item.reshape(batch_len1_shape)
    y_pred_batch_len1 = model(batch_len1)
    print("input_item", input_item)
    print("input_item.size()", input_item.size())
    print("y_pred_batch_len1.size()", y_pred_batch_len1.size())
    print(y_1pred)

    raise Exception
This is the output it generates:
###########
x[0] tensor([-1.3901, -0.2659, 0.4352, -0.6890, 0.1098, -0.3124, 0.6419, 1.1004,
-0.7910, -0.5389])
x[0].size() torch.Size([10])
y_1pred.size() torch.Size([3])
tensor([-0.5366, -0.4826, 0.0538], grad_fn=<AddBackward0>)
###########
x.size() torch.Size([64, 10])
y_pred.size() torch.Size([64, 3])
y_pred[0] tensor([-0.5366, -0.4826, 0.0538], grad_fn=<SelectBackward>)
###########
input_item tensor([-1.3901, -0.2659, 0.4352, -0.6890, 0.1098, -0.3124, 0.6419, 1.1004,
-0.7910, -0.5389])
input_item.size() torch.Size([10])
y_pred_batch_len1.size() torch.Size([1, 3])
tensor([-0.5366, -0.4826, 0.0538], grad_fn=<AddBackward0>)
The docs on nn.Linear state that
Input: (N,∗,in_features) where ∗ means any number of additional dimensions
so one would naturally expect that at least two dimensions are necessary. However, if we look under the hood we will see that Linear is implemented in terms of nn.functional.linear, which dispatches to torch.addmm or torch.matmul (depending on whether bias == True), both of which broadcast their arguments.
So this behavior is likely a bug (or an error in documentation) and I would not depend on it working in the future, if I were you.
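If you want to see this behaviour in isolation, here is a minimal sketch (the sizes mirror the question's D_in=10, D_out=3):

import torch
import torch.nn as nn

lin = nn.Linear(10, 3)
print(lin(torch.randn(10)).shape)      # torch.Size([3])     -- 1-D input is accepted
print(lin(torch.randn(1, 10)).shape)   # torch.Size([1, 3])  -- explicit mini-batch of size 1
print(lin(torch.randn(64, 10)).shape)  # torch.Size([64, 3]) -- regular mini-batch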

How do I get CSV files into an Estimator in Tensorflow 1.6

I am new to TensorFlow (and this is my first question on StackOverflow).
As a learning tool, I am trying to do something simple. (4 days later I am still confused)
I have one CSV file with 36 columns (3500 records) with 0s and 1s.
I am envisioning this file as a flattened 6x6 matrix.
I have another CSV file with 1 column of ground truth 0 or 1 (3500 records), which indicates whether at least 4 of the 6 elements on the 6x6 matrix's diagonal are 1s.
I am not sure I have processed the CSV files correctly.
I am confused as to how I create the features dictionary and Labels and how that fits into the DNNClassifier
I am using TensorFlow 1.6, Python 3.6
Below is the small amount of code I have so far.
import tensorflow as tf
import os
def x_map(line):
    rDefaults = [[] for cl in range(36)]
    x_row = tf.decode_csv(line, record_defaults=rDefaults)
    return x_row

def y_map(line):
    line = tf.string_to_number(line, out_type=tf.int32)
    y_row = tf.one_hot(line, depth=2)
    return y_row

x_path_file = os.path.join('D:', 'Diag', '6x6_train.csv')
y_path_file = os.path.join('D:', 'Diag', 'HasDiag_train.csv')

filenames = [x_path_file]
x_dataset = tf.data.TextLineDataset(filenames)
x_dataset = x_dataset.map(x_map)
x_dataset = x_dataset.batch(1)
x_iter = x_dataset.make_one_shot_iterator()
x_next_el = x_iter.get_next()

filenames = [y_path_file]
y_dataset = tf.data.TextLineDataset(filenames)
y_dataset = y_dataset.map(y_map)
y_dataset = y_dataset.batch(1)
y_iter = y_dataset.make_one_shot_iterator()
y_next_el = y_iter.get_next()

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    x_el = (sess.run(x_next_el))
    y_el = (sess.run(y_next_el))
The output for x_el is:
(array([1.], dtype=float32), array([1.], dtype=float32), array([1.], dtype=float32), array([1.], dtype=float32), array([1.], dtype=float32), array([0.] ... it goes on...
The output for y_el is:
[[1. 0.]]
You're pretty much there for a minimal working model. The main issue I see is that tf.decode_csv returns a tuple of tensors, whereas I expect you want a single tensor with all values. Easy fix:
x_row = tf.stack(tf.decode_csv(line, record_defaults=rDefaults))
That should work... but it fails to take advantage of many of the awesome things the tf.data.Dataset API has to offer, like shuffling and parallel threading. For example, if you shuffle each dataset separately, those shuffling operations won't be consistent. This is because you've created two separate datasets and manipulated them independently. If you instead zip them together first and then apply the manipulations, those manipulations will be consistent.
Try something along these lines:
def get_inputs(
        count=None, shuffle=True, buffer_size=1000, batch_size=32,
        num_parallel_calls=8, n_dims=6,
        x_paths=[x_path_file], y_paths=[y_path_file]):
    """
    Get x, y inputs.

    Args:
        count: number of epochs. None indicates infinite epochs.
        shuffle: whether or not to shuffle the dataset
        buffer_size: used in shuffle
        batch_size: size of batch. See outputs below
        num_parallel_calls: used in map. Note if > 1, intra-batch ordering
            will be shuffled
        n_dims: side length of the square grid (6 for the 6x6 matrix)
        x_paths: list of paths to x-value files.
        y_paths: list of paths to y-value files.

    Returns:
        x: (batch_size, n_dims**2) tensor
        y: (batch_size, 2) tensor of 1-hot labels
    """
    def x_map(line):
        rDefaults = [[] for cl in range(n_dims**2)]
        x_row = tf.stack(tf.decode_csv(line, record_defaults=rDefaults))
        return x_row

    def y_map(line):
        line = tf.string_to_number(line, out_type=tf.int32)
        y_row = tf.one_hot(line, depth=2)
        return y_row

    def xy_map(x, y):
        return x_map(x), y_map(y)

    x_ds = tf.data.TextLineDataset(x_paths)
    y_ds = tf.data.TextLineDataset(y_paths)
    combined = tf.data.Dataset.zip((x_ds, y_ds))
    combined = combined.repeat(count=count)
    if shuffle:
        combined = combined.shuffle(buffer_size)
    combined = combined.map(xy_map, num_parallel_calls=num_parallel_calls)
    combined = combined.batch(batch_size)
    x, y = combined.make_one_shot_iterator().get_next()
    return x, y
To experiment/debug,
x, y = get_inputs()
with tf.Session() as sess:
    xv, yv = sess.run((x, y))
    print(xv.shape, yv.shape)
For use in an estimator, pass the function itself.
estimator.train(get_inputs, max_steps=10000)
def get_eval_inputs():
    return get_inputs(
        count=1, shuffle=False,
        x_paths=[x_eval_paths],
        y_paths=[y_eval_paths])

estimator.evaluate(get_eval_inputs)
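The question also asks how this fits into a DNNClassifier. Here is a rough sketch under my own assumptions (the feature key 'x', the hidden-unit sizes and the model_dir are made up; DNNClassifier wants a dict of features and integer class labels rather than 1-hot vectors, hence the reshape and argmax):

def get_train_inputs_for_classifier():
    x, y = get_inputs()
    features = {'x': tf.reshape(x, (-1, 36))}  # flatten each example to its 36 values
    labels = tf.argmax(y, axis=1)              # 1-hot -> class index (0 or 1)
    return features, labels

feature_columns = [tf.feature_column.numeric_column('x', shape=(36,))]
estimator = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[64, 32],        # made-up layer sizes
    n_classes=2,
    model_dir='/tmp/diag_model')  # hypothetical model directory
estimator.train(get_train_inputs_for_classifier, max_steps=10000)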

unknown resampling filter error when trying to create my own dataset with pytorch

I am trying to create a CNN implemented with data augmentation in pytorch to classify dogs and cats. The issue that I am having is that when I try to input my dataset and enumerate through it I keep getting this error:
Traceback (most recent call last):
File "<ipython-input-55-6337e0536bae>", line 75, in <module>
for i, (inputs, labels) in enumerate(trainloader):
File "/usr/local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 188, in __next__
batch = self.collate_fn([self.dataset[i] for i in indices])
File "/usr/local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 188, in <listcomp>
batch = self.collate_fn([self.dataset[i] for i in indices])
File "/usr/local/lib/python3.6/site-packages/torchvision/datasets/folder.py", line 124, in __getitem__
img = self.transform(img)
File "/usr/local/lib/python3.6/site-packages/torchvision/transforms/transforms.py", line 42, in __call__
img = t(img)
File "/usr/local/lib/python3.6/site-packages/torchvision/transforms/transforms.py", line 147, in __call__
return F.resize(img, self.size, self.interpolation)
File "/usr/local/lib/python3.6/site-packages/torchvision/transforms/functional.py", line 197, in resize
return img.resize((ow, oh), interpolation)
File "/usr/local/lib/python3.6/site-packages/PIL/Image.py", line 1724, in resize
raise ValueError("unknown resampling filter")
ValueError: unknown resampling filter
and I really don't know what's wrong with my code. I have provided the code below:
# Creating the CNN
# Importing the libraries
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.autograd import Variable
import torchvision
from torchvision import transforms
#Creating the CNN Model
class CNN(nn.Module):
    def __init__(self, nb_outputs):
        super(CNN, self).__init__() # activates the inheritance and allows the use of all the tools in nn.Module
        # making the 3 convolutional layers that will be used in the convolutional neural network
        self.convolution1 = nn.Conv2d(in_channels = 1, out_channels = 32, kernel_size = 5) # kernel_size -> the dimension of the feature detector e.g. kernel_size = 5 => feature detector of size 5x5
        self.convolution2 = nn.Conv2d(in_channels = 32, out_channels = 64, kernel_size = 2)
        # making 2 full connections, one to connect the inputs of the ANN to the hidden layer and another to connect the hidden layer to the outputs of the ANN
        self.fc1 = nn.Linear(in_features = self.count_neurons((1, 64, 64)), out_features = 40)
        self.fc2 = nn.Linear(in_features = 40, out_features = nb_outputs)

    def count_neurons(self, image_dim):
        x = Variable(torch.rand(1, *image_dim)) # this variable represents a fake image to allow us to compute the number of neurons
        # in order to pass the elements of the tuple image_dim into our function as a list of arguments we need to add a * before image_dim
        # since x will be going into our neural network we need to convert it into a torch variable using the Variable() function
        x = F.relu(F.max_pool2d(self.convolution1(x), 3, 2)) # first we apply the convolution to x, then apply max pooling to the convolutional fake images and then activate all the neurons in the pooling layer
        x = F.relu(F.max_pool2d(self.convolution2(x), 3, 2)) # the signals are now propagated up to the third convolutional layer
        # now to flatten x to obtain the number of neurons in the flattening layer
        return x.data.view(1, -1).size(1) # this will flatten x into a huge vector and returns the size of the vector; that size represents the number of neurons that will be input into the ANN
        # even though x is not a real image from the game, since the size of the flattened vector only depends on the dimensions of the input image we can just set x to have the same dimensions as the image

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.convolution1(x), 3, 2)) # first we apply the convolution to x, then apply max pooling to the convolutional images and then activate all the neurons in the pooling layer
        x = F.relu(F.max_pool2d(self.convolution2(x), 3, 2))
        # flattening layer of the CNN
        x = x.view(x.size(0), -1)
        # x is now the input to the ANN
        x = F.relu(self.fc1(x)) # we propagate the signals from the flattened layer to the fully connected layer and activate the neurons by breaking the linearity with the relu function
        x = F.sigmoid(self.fc2(x))
        # x is now the output neurons of the ANN
        return x

train_tf = transforms.Compose([transforms.RandomHorizontalFlip(),
                               transforms.Resize(64,64),
                               transforms.RandomRotation(20),
                               transforms.RandomGrayscale(.2),
                               transforms.ToTensor()])

test_tf = transforms.Compose([transforms.Resize(64,64),
                              transforms.ToTensor()])

training_set = torchvision.datasets.ImageFolder(root = './dataset/training_set',
                                                transform = train_tf)
test_set = torchvision.datasets.ImageFolder(root = './dataset/test_set',
                                            transform = transforms.Compose([transforms.Resize(64,64),
                                                                            transforms.ToTensor()]) )

trainloader = torch.utils.data.DataLoader(training_set, batch_size=32,
                                          shuffle=True, num_workers=0)
testloader = torch.utils.data.DataLoader(test_set, batch_size= 32,
                                         shuffle=False, num_workers=0)

# training the model
cnn = CNN(1)
cnn.train()
loss = nn.BCELoss()
optimizer = optim.Adam(cnn.parameters(), lr = 0.001) # the optimizer => Adam optimizer
nb_epochs = 25

for epoch in range(nb_epochs):
    train_loss = 0.0
    train_acc = 0.0
    total = 0.0
    for i, (inputs, labels) in enumerate(trainloader):
        inputs, labels = Variable(inputs), Variable(labels)
        cnn.zero_grad()
        outputs = cnn(inputs)
        loss_error = loss(outputs, labels)
        optimizer.step()
        _, pred = torch.max(outputs.data, 1)
        total += labels.size(0)
        train_loss += loss_error.data[0]
        train_acc += (pred == labels).sum()
    train_loss = train_loss/len(training_loader)
    train_acc = train_acc/total
    print('Epoch: %d, loss: %.4f, accuracy: %.4f' %(epoch+1, train_loss, train_acc))
The folder arrangement for the code is /dataset/training_set, and inside the training_set folder are two more folders, one for all the cat images and the other for all the dog images. Each image is named either dog.xxxx.jpg or cat.xxxx.jpg, where xxxx represents the number, so for the first cat image it would be cat.1.jpg, up to cat.4000.jpg. This is the same format for the test_set folder. The number of training images is 8000 and the number of test images is 2000. If anyone can point out my error I would greatly appreciate it.
Thank you
Try to set the desired size in transforms.Resize as a tuple:
transforms.Resize((64, 64))
PIL is using the second argument (in your case 64) as the interpolation method.
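Applied to the transforms from the question, only the Resize calls change, something like this:

train_tf = transforms.Compose([transforms.RandomHorizontalFlip(),
                               transforms.Resize((64, 64)),
                               transforms.RandomRotation(20),
                               transforms.RandomGrayscale(.2),
                               transforms.ToTensor()])

test_tf = transforms.Compose([transforms.Resize((64, 64)),
                              transforms.ToTensor()])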
Put every transform inside the brackets of torchvision.transforms.Compose([...]); this will not give the error.