does RNN suitable with kaggle Titanic - Machine Learning from Disaster - csv

I've found a tutorial in Kaggle web site that explains how to use RNN (Recurrent Neural Network) on the titanic data set in order to predict who survived.
my question is - how come RNN is suitable for this problem?
I thought RNN is not suitable for problems with csv file as a data set.
link to the tutorial (you can find the csv files in there) - https://www.kaggle.com/lusob04/titanic-rnn
and here is a sample of the dataset -
and another question - do you think CNN or RL is better suited for this problem?

Just like you suspected, RNN is not suitable for this kind of problem.
The link you shared offer a way to use RNN to solve the problem but I think using RNN in this case is an over-kill and that other, more simple models might even get you better results.

Related

Stacking/chaining CNNs for different use cases

So I'm getting more and more into deep learning using CNNs.
I was wondering if there are examples of "chained" (I don't know what the correct term would be) CNNs - what I mean by that is, using e.g. a first CNN to perform a semantic segmentation task while using its output as input for a second CNN which for example performs a classification task.
My questions would be:
What is the correct term for this sequential use of neural networks?
Is there a way to pack multiple networks into one "big" network which can be trained in one a single step instead of training 2 models and combining them.
Also if anyone could maybe provide a link so I could read about that kind of stuff, I'd really appreciate it.
Thanks a lot in advance!
Sequential use of independent neural networks can have different interpretations:
The first model may be viewed as a feature extractor and the second one is a classifier.
It may be viewed as a special case of stacking (stacked generalization) with a single model on the first level.
It is a common practice in deep learning to chain multiple models together and train them jointly. Usually it calls end-to-end learning. Please see the answer about it: https://ai.stackexchange.com/questions/16575/what-does-end-to-end-training-mean

Custom translator - Model adjustment after training

I've used three parallel sentence files to train my custom translator model. No dictionary files and no tuning files too. After training is finished and I've checked test results, I want to make some adjustments in the model. And here are several questions:
Is it possible to tune the model after training? Am I right that the model can't be changed and the only way is to train a new model?
The best approach to adjusting the model is to use tune files. Is it correct?
There is no way to see an autogenerated tune file, so I have to provide my own tuning file for a more manageable tuning process. Is it so?
Could you please describe how the tuning file is generated, when I have 3 sentence files with different amount of sentences, which is: 55k, 24k and 58k lines. Are all tuning sentences is from the first file or from all three files proportionally to their size? Which logic is used?
I wish there were more authoritative answers on this, I'll share what I know as a fellow user.
What Microsoft Custom Translator calls "tuning data" is what is normally known as a validation set. It's just a way to avoid overfitting.
Is it possible to tune the model after training? Am I right that the model can't be changed and the only way is to train a new model?
Yes, with Microsoft Custom Translator you can only train a model based on the generic category you have selected for the project.
(With Google AutoML technically you can choose to train a new model based on one of your previous custom models. However, it's also not usable without some trial and error.)
The best approach to adjusting the model is to use tune files. Is it correct?
It's hard to make a definitive statement on this. The training set also has an effect. A good validation set on top of a bad training set won't get us good results.
There is no way to see an autogenerated tune file, so I have to provide my own tuning file for a more manageable tuning process. Is it so?
Yes, it seems to me that if you let it decide how to split the training set into the training set, tuning set and test set, you can only download the training set and the test set.
Maybe neither includes the tuning set, so theoretically you can diff them. But that doesn't solve the problem of the split being different between different models.
... Which logic is used?
Good question.

Where to find deep learning based prediction model

I need to find a deep learning based prediction model, where can I find it?
You can use Pytorch and Tensorflow pretrained models.
https://pytorch.org/docs/stable/torchvision/models.html
They can be automatically downloaded. There are some sample codes, that you can try:
https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html#sphx-glr-beginner-blitz-cifar10-tutorial-py
If you are interested in deep learning, I suggest you review the basics of it in cs231n stanford. Your question is a bit odd, because you first need to define your task specifically. Prediction is not a good description. You could look for models for classification, segmentation, object detection, sequence2sequence(like translation), and so on...
Then you need to know how to search through projects on github, and then you need to know python (in most cases), and then use a pretrained model or use your own dataset to train or fine-tune the model for that task. Then you could pray that you have found a good model for your task, after that you need to validate the results on a test set. However, implementation of a model for real-life scenarios is another thing that you need to consider many other things, and you usually need some online-learning strategy, like Federated Learning. I hope that I could help you.

How to creat CNN model in Image Recognition with Tensorflow to compare with Inception v3

I'm studying Image Recognition with Tensorflow. I already read about the topic How to retrain Inception's Layer for new categories on Tensorflow.org, which utilize the Inception v3 training model.
Now, I desire to creat my own CNN model in order to compare with Inception v3, but I don't know how can I begin with.
Anyone knows some guides step-by-step on this problem?
I'd appreciate any your suggestion
Thanks in advance
First baby steps
The gold standard for getting started in image recognition is processing MNIST images. Tensorflow has a great tutorial on how to get started and also how to move to convolutional networks.
From there it is a long hard road to compete with Inception without just copying someone else's graph. You'll probably want to get a feel for what the different layers of convolution do. I created a basic Tensorflow Tutorial which contains an example python file that demos different convolution graphs and their resulting accuracy.
Going deeper
After conquering MNIST you'll need a lot of images (you can get them from imageNet) and a lot of GPU (to run all your training) and a software setup so that you can not only run and test your model, but dozens (if not hundreds) of variations to explore your hyper parameters (like learning rate, convolution size, dropout, etc). Remember, it took a team of leading edge Machine Learning experts to create something like Inception, many many months (possibly years) of iteration to find the model they use today, and thousands of CPU/GPU hours.
If you are trying to understand what is going on and what makes a good graph, then trying to recreate Inception is a great idea. If you just want an excellent Image recognition model, then reuse an existing one.
If you are trying to have fun, just do it!
Cheers-

Multiple pretrained networks in Caffe

Is there a simple way (e.g. without modifying caffe code) to load wights from multiple pretrained networks into one network? The network contains some layers with same dimensions and names as both pretrained networks.
I am trying to achieve this using NVidia DIGITS and Caffe.
EDIT: I thought it wouldn't be possible to do it directly from DIGITS, as confirmed by answers. Can anyone suggest a simple way to modify the DIGITS code to be able to select multiple pretrained networks? I checked the code a bit, and thought the training script would be a good place to start, but I don't have in-depth knowledge of Caffe, so I'm not sure what the best/quickest way to achieve this would be.
As Shai suggested, there was no way of doing this, so I decided to clone the official repository and make the appropriate changes. I changed the code so that multiple pretrained networks can be loaded by using a colon as separator.
I created a pull request on the official repository and my changes were then merged with the main branch of DIGITS, meaning it is now possible to use this functionality in DIGITS.
AFAIK there is no straight forward way of doing so.
However, you can use net surgery to load the pretrained models and manually assign their weights to the target net. Once you have a single net with all the weights initialized according to the various pretrained models, you can save it and use it as a single pretrained model for the rest of your work.