Neural Machine Translation - deep-learning

I am doing a project on Neural Machine Translation. In the data-processing step, is it necessary to pad the sentences so that they are all of equal length?
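In practice, padding is usually applied per batch (to the longest sentence in that batch) rather than across the whole corpus, and packing or masking keeps the pad tokens from influencing the model. A minimal PyTorch sketch of the usual approach (the token ids and sizes are made up for illustration):

    import torch
    from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

    # Three token-id sequences of different lengths (toy data).
    sentences = [torch.tensor([4, 9, 2]), torch.tensor([7, 3]), torch.tensor([5])]
    lengths = torch.tensor([3, 2, 1])

    # Pad to the longest sentence in this batch; 0 is the (assumed) pad id.
    padded = pad_sequence(sentences, batch_first=True, padding_value=0)  # shape (3, 3)

    # Packing lets the RNN skip the padded positions entirely.
    embedded = torch.nn.Embedding(10, 8)(padded)
    packed = pack_padded_sequence(embedded, lengths, batch_first=True, enforce_sorted=False)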

Related

How do graph neural networks work for molecular generation?

I am learning AI in order to apply it to chemistry, specifically molecular generation. I have finished learning how to generate novel molecules using RNN-based architectures such as GRU and LSTM. The process in these architectures is as follows:
the input is a character of a known molecule (represented as a string), and the model's task is to predict the next character, so the output is a softmax probability distribution over all characters. The loss is then computed as the cross-entropy between the predicted distribution and the real next character.
I am now moving to graph neural networks as they seem to provide more advantages over RNN-based architectures. Although I did my research, I could not understand how they work for this task (i.e., molecule generation). I mean, similarly, what is the input, the output, and the loss function that we are trying to minimize in GNN-based molecular generation? Thanks in advance.
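For concreteness, here is a minimal PyTorch sketch of the RNN training step just described (the alphabet size and dimensions are arbitrary toy values):

    import torch
    import torch.nn as nn

    vocab_size, embed_dim, hidden_dim = 30, 16, 64  # toy sizes
    embed = nn.Embedding(vocab_size, embed_dim)
    rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
    head = nn.Linear(hidden_dim, vocab_size)
    loss_fn = nn.CrossEntropyLoss()

    # One training step: inputs are characters 0..T-1, targets are characters 1..T.
    seq = torch.randint(0, vocab_size, (1, 12))   # stand-in for an encoded SMILES string
    inputs, targets = seq[:, :-1], seq[:, 1:]
    hidden_states, _ = rnn(embed(inputs))         # (1, T-1, hidden_dim)
    logits = head(hidden_states)                  # (1, T-1, vocab_size)
    loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
    loss.backward()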

How to train an end-to-end model that is part RNN and part graph neural network?

I am currently working on a project in which I need to use RNNs as part of a neural network. Essentially, the RNN would take in text of variable length and output a feature representation. This feature representation would then be concatenated with some more feature vectors and fed to a different graph neural network. Some loss would be calculated on the output of the graph neural network and backpropagated through the entire network, including the RNN, to train it end to end.
However, I am not able to wrap my head around how I can use the RNN as part of another, non-sequential model. I use PyTorch for most of my work.
Can anyone suggest a way to address this problem, or point me to any material that might be useful?
Thanks
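One way to see why this works: if the RNN and the graph network are submodules of a single nn.Module, autograd flows through both automatically. Below is a minimal sketch under that assumption; the class name, sizes, and the hand-rolled one-layer message-passing step (a linear map followed by an adjacency-matrix product) are all illustrative stand-ins for a real GNN layer.

    import torch
    import torch.nn as nn

    class RNNThenGNN(nn.Module):
        """Hypothetical end-to-end model: an RNN encodes each node's text,
        the encoding is concatenated with extra node features, and a tiny
        hand-rolled message-passing step mixes neighbours."""
        def __init__(self, vocab_size=100, embed_dim=32, rnn_dim=64, extra_dim=8, out_dim=4):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
            self.rnn = nn.GRU(embed_dim, rnn_dim, batch_first=True)
            self.mix = nn.Linear(rnn_dim + extra_dim, out_dim)  # one "GNN" layer

        def forward(self, token_ids, extra_feats, adj):
            # token_ids: (num_nodes, max_len), extra_feats: (num_nodes, extra_dim),
            # adj: (num_nodes, num_nodes) normalised adjacency matrix.
            _, h = self.rnn(self.embed(token_ids))   # h: (1, num_nodes, rnn_dim)
            node_feats = torch.cat([h.squeeze(0), extra_feats], dim=1)
            return adj @ self.mix(node_feats)        # aggregate over neighbours

    model = RNNThenGNN()
    token_ids = torch.randint(1, 100, (5, 7))        # 5 nodes, 7 tokens each
    extra = torch.randn(5, 8)
    adj = torch.eye(5)                               # trivial graph for the demo
    loss = model(token_ids, extra, adj).sum()
    loss.backward()                                  # gradients reach the RNN too

In practice you would replace the linear-plus-adjacency step with a proper GNN layer (e.g. from PyTorch Geometric) and pad or pack the variable-length token sequences, but the key point stands: one module, one optimizer, one backward pass trains both parts end to end.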

How to train a pre-trained CNN on a new dataset that is not organised into classes (unsupervised)

I have a pretrained CNN (ResNet-18) trained on ImageNet, and now I want to extend it to my own dataset of video frames. The problem is that all the fine-tuning tutorials I found require the dataset to be organised into classes, like
class1/train/
class1/test/
class2/train/
class2/test/
but I only have frames from many videos. How can I train my CNN on them?
Can anyone point me in the right direction? Any tutorial, paper, etc.?
PS: My final goal is to get deep features for all the frames that I provide at test time.
To train a network, you need some label (sometimes called y) for your input data. From there, the network calculates a loss between the logits (the network's answer) and the given label.
The network then revises itself by backpropagating that loss value; that process is what we call training.
Because you only have input data and no labels, you can only get the logits, which means a loss cannot be calculated.
Fine-tuning is almost the same thing as additional training, so you cannot fine-tune your pre-trained network without labeled data.
As for the train set and test set, that is not the problem right now.
If you have enough labeled input data, you can divide it with some ratio
(e.g. 80% of the data for training, 20% for testing).
The reason for dividing the data into these two sets is that we want to check how the trained network performs in a more general, unseen situation.
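For example, a quick 80/20 split over a list of labelled file paths might look like this (the paths are made up for illustration):

    import random

    paths = [f"frames/video_{i:03d}.jpg" for i in range(1000)]  # hypothetical file list
    random.shuffle(paths)
    split = int(0.8 * len(paths))
    train_paths, test_paths = paths[:split], paths[split:]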
However, if you just feed your data into the pre-trained network (the encoder part), it will give you deep features. They may not fit your task exactly, but they are still deep features.
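A minimal PyTorch sketch of that feature-extraction idea (the input tensor is a stand-in for your frames):

    import torch
    import torchvision.models as models

    resnet = models.resnet18(pretrained=True)
    resnet.eval()

    # Drop the final classification layer; what remains is the encoder.
    encoder = torch.nn.Sequential(*list(resnet.children())[:-1])

    frame_batch = torch.randn(4, 3, 224, 224)       # stand-in for 4 video frames
    with torch.no_grad():
        features = encoder(frame_batch).flatten(1)  # (4, 512) deep features

For ResNet-18 the encoder output is a 512-dimensional vector per frame, which you can use directly as the deep feature.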
Added:
Unsupervised pre-training for convolutional neural network in theano
That covers the method you need: a deep-feature encoder in an unsupervised setting. I hope it helps.

Testing a network without setting the number of iterations

I have a pre-trained network with which I would like to test my data. I defined the network architecture in a .prototxt, and my data layer is a custom Python layer that receives a .txt file with the paths of my data and their labels, preprocesses the data, and then feeds it to the network.
At the end of the network, I have a custom Python layer that gets the class prediction made by the net and the label (from the first layer) and prints, for example, the accuracy over all batches.
I would like to run the network until all examples have passed through the net.
However, while searching for the command to test a network, I've found:
caffe test -model architecture.prototxt -weights model.caffemodel -gpu 0 -iterations 100
If I don't set the -iterations, it uses the default value (50).
Does any of you know a way to run caffe test without setting the number of iterations?
Thank you very much for your help!
No, Caffe does not have a facility to detect that it has run exactly one epoch (use each input vector exactly once). You could write a validation input routine to do that, but Caffe expects you to supply the quantity. This way, you can generate easily comparable results for a variety of validation data sets. However, I agree that it would be a convenient feature.
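A practical workaround is to compute the iteration count yourself from the size of the .txt list and the batch size, then pass it on the command line. A rough sketch (the file name and batch size are assumptions):

    import math
    import subprocess

    # Count the examples in the list file (assuming one example per line).
    num_examples = sum(1 for _ in open("test_list.txt"))
    batch_size = 32  # must match the batch size of the data layer in architecture.prototxt
    iterations = math.ceil(num_examples / batch_size)

    subprocess.run(["caffe", "test",
                    "-model", "architecture.prototxt",
                    "-weights", "model.caffemodel",
                    "-gpu", "0",
                    "-iterations", str(iterations)])

Note that if your data layer loops over the list, using ceil means the last batch will re-see a few examples; use floor instead if you would rather skip the remainder.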
The lack of this feature is probably related to its absence in training and in the interstitial testing done during training.
In training, we tune the hyper-parameters to get the most accurate model for a given application. As it turns out, accuracy depends more closely on the total number of examples processed (batch size × iterations) than on the number of epochs (given a sufficiently large training set).
With a fixed training set, we often graph accuracy (y-axis) against epochs (x-axis), because that gives tractable results as we adjust the batch size. However, if we cut the size of the training set in half, the most comparable graph would scale the x-axis by the total number of examples processed rather than by epoch number.
Also, by restricting the size of the test set, we avoid long waits for that feedback during training. For instance, in training against the ImageNet data set (1.2M images), I generally test with around 1000 images, typically no more than 5 times per epoch.

Can I train a deep convolutional network without GPUs?

I am thinking of building a convolutional neural network for a tracking-system application. I get the feeling that all deep-network applications require the use of GPUs. Is it necessary to use GPUs for a task like mine? What are the minimum PC requirements my laptop should meet?
It all depends on the size and depth of your CNN. If your CNN has one convolutional layer and one fully connected layer, and the input images are 64x64, you will be able to train your network on your laptop in a reasonable time. If you use GoogLeNet with its hundred or so layers and train on the entire ImageNet set, then even with a video card it will take you a week, so on a CPU it would never finish training.
For most practical applications, however, it is desirable to have a GPU for training a convolutional network. Note that on AWS you can get GPU-enabled instances for a rather reasonable price, especially if you use spot instances, so you don't necessarily need a GPU locally.
Last note: most frameworks (Theano, Torch, Caffe, MXNet, TensorFlow) allow you to execute the same model on CPU and on GPU with minor or no modifications to the code, so you can prototype locally on the CPU with a small set of images and then, once your model works, train it on AWS on a GPU instance.
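As an illustration of that last point, here is how the CPU/GPU switch typically looks in PyTorch (the layer sizes are arbitrary):

    import torch

    # Prototype on CPU, switch to GPU when available; the model code is unchanged.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    model = torch.nn.Conv2d(3, 16, kernel_size=3).to(device)
    images = torch.randn(8, 3, 64, 64, device=device)
    out = model(images)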