How to do Transfer Learning with LSTM for time series forecasting? - deep-learning

I am working on a project about time-series forecasting using LSTM layers. The dataset used for training and testing the model was collected from 443 people who wore a sensor that samples a physical variable (1 variable/measurement) every 5 minutes; for each patient there are around 5000 records/readings.
Although I can train and test my model under different scenarios, I am having trouble finding information about how to apply transfer learning in such an architecture. I mean, I understand I can use inductive transfer learning by copying the weight matrices from the general model onto a secondary model (for an unknown person), and then re-train this model with person-specific data and evaluate the result.
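For example, a minimal Keras sketch of what I have in mind (the window length, layer sizes, and data variables are placeholders):

```python
import tensorflow as tf

# A "general" model trained on the pooled data from all 443 subjects
# (here: a window of 12 five-minute readings -> the next reading).
general = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, input_shape=(12, 1)),
    tf.keras.layers.Dense(1),
])
general.compile(optimizer="adam", loss="mse")
# general.fit(X_all, y_all, ...)  # train on the pooled dataset

# Transfer: clone the architecture and copy the learned weights.
personal = tf.keras.models.clone_model(general)
personal.set_weights(general.get_weights())

# Optionally freeze the LSTM so only the output head adapts to the
# new person's data (a common fine-tuning variant).
personal.layers[0].trainable = False
personal.compile(optimizer=tf.keras.optimizers.Adam(1e-4), loss="mse")
# personal.fit(X_new_person, y_new_person, ...)
```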
But I would like to know if somebody knows other ways to apply transfer learning to this type of architecture, or where to find information about it, since there aren't many scientific papers on the subject; most of them discuss NLP and other types of applications, but what about time series?
Cheers X )

Related

How to evaluate the deep learning time series forecasting models?

I am working on a long-term time series (wind speed) forecasting model with different deep learning algorithms. I am using MLP, CNN, and LSTM. I have several questions, and I would appreciate it if you could answer them.
- Do I have to do any preprocessing for seasonality for these deep learning models?
- Why is my R-squared so bad and sometimes negative?
- When I plot the predictions on the train or test set, it is obvious that the model is not good, since the prediction looks like a straight line and does not capture the trend. However, my evaluation metrics look really good: for example, the RMSE, MAE, and MAPE are 0.77, 0.67, and 0.1, respectively. So is it enough to just report these metrics, as many articles do?
- And the last one: is it possible to use the proposed model on different datasets? Is it reasonable to use another city's wind speed dataset, with a different pattern and trend, on this model? I have seen many articles that have done this, but my models do not work on different datasets.
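For context on the R-squared question: as I understand it, R² = 1 − SS_res/SS_tot, so it goes negative whenever the model's squared error exceeds that of always predicting the mean. A minimal check (the values are made up):

```python
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_flat = np.full_like(y_true, y_true.mean())  # "straight line" prediction
y_bad = np.array([9.0, 3.0, 9.0, 3.0])        # worse than the mean

print(r2_score(y_true, y_flat))  # 0.0: no better than predicting the mean
print(r2_score(y_true, y_bad))   # -3.0: worse than the mean predictor
```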

6D pose estimation of a known 3D CAD object with limited model training for a new object

I'm working on a project where I need to estimate the 6DOF pose of a known 3D CAD object in a single RGB image - i.e. this task: https://paperswithcode.com/task/6d-pose-estimation. There are several constraints on the problem:
Usable commercially (licensed under BSD, MIT, BOOST, etc.), not GPL.
The CAD object is known and we do NOT aim for generality (i.e. recognize the class of all chairs).
The CAD object can be uploaded by a user, so it may have symmetries and a range of textures.
Inference step will be run on a smartphone, and should be able to run at >30fps.
The inference step can either be a) find the pose of the object once and then I can write code to continue to track it or b) find the pose of the object continuously. I.e. the model doesn't need to have any continuous refinement steps after the initial pose estimate is found.
Can be anywhere on the scale from a single instance of a single object to multiple instances of multiple objects (MiMO). MiMO is preferred, but not required.
If a deep learning approach is used, the training time required for a new CAD object should be on the order of hours, not days.
Can either 1) just find the initial pose of an object with no refinement steps afterwards, or 2) find the initial pose of the object and also refine it afterwards.
I am open to traditional approaches (i.e. 2D->3D correspondences, then solving with PnP), but it seems like deep learning approaches outperform them (classical ones are too slow; see: Real time 6D pose estimation of known 3D CAD objects from a single 2D image or point clouds from RGBD Camera when objects are one on top of the other?). Looking at deep learning approaches (PoseCNN, HybridPose, Pix2Pose, CosyPose), it seems most of them match these constraints, except that they require model training time. Perhaps I could use a single pre-trained model and then specialize it for each new CAD object with a shorter training step, but I am not sure of this, and I think success probably depends on the specific model chosen. For example, this project says it requires 3 hours of training time: https://github.com/DLR-RM/AugmentedAutoencoder.
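For reference, the classical route I mean is just known model points plus their detected pixels fed to cv2.solvePnP; a minimal sketch with placeholder correspondences and intrinsics:

```python
import numpy as np
import cv2

# Four known CAD model points (object frame, metres) on a planar face,
# matched to their detected pixel locations (placeholder values).
object_points = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0],
                          [0.1, 0.1, 0.0], [0.0, 0.1, 0.0]])
image_points = np.array([[320.0, 240.0], [400.0, 240.0],
                         [400.0, 160.0], [320.0, 160.0]])

# Pinhole intrinsics; fx, fy, cx, cy stand in for a real calibration.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, None)
# rvec/tvec give the object's 6DOF pose in the camera frame; from here
# one can track the object or re-estimate the pose per frame.
```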
So, my question: does anybody know of a state-of-the-art, commercially usable implementation that doesn't require extensive training time for a new CAD object?

Fine-tuning SSD MobileNet

I am currently working on vehicle detection using SSD MobileNet with the TensorFlow Object Detection API. I have made a custom dataset from the COCO dataset which comprises all the vehicle categories in COCO (car, bicycle, motorcycle, bus, and truck), and I also have a dataset of 730 rickshaw images.
Ultimately my goal is to detect rickshaws along with other vehicles as well. But so far I have failed.
There are a total of 16000 instances in train_labels.csv; on average each class has 2300 instances. I set the batch size to 12 and then trained the COCO pre-trained model on my custom dataset for 12000 steps.
But unfortunately I have not been able to get good results: after training, the model failed to classify the other vehicles.
Any advice on the ratio of each class in the dataset, on whether I need more rickshaw images, or on how many layers I should freeze, or perhaps a different perspective entirely, would be highly appreciated.
Since you already have a custom dataset of 730 rickshaw images, I think there is no need to extract a separate dataset of other vehicles from COCO for fine-tuning.
What I mean is that the TensorFlow pre-trained model is already really good at detecting all the other vehicles; your task is just to teach the model how to detect rickshaws.
Another option, since you already have a vehicle dataset, is to train a model using checkpoints from COCO.
https://towardsdatascience.com/how-to-train-your-own-object-detector-with-tensorflows-object-detector-api-bec72ecfe1d9
Go through the article above; it will give you a fair idea of the end-to-end flow. The author fine-tuned an SSD MobileNet model trained on the COCO dataset to detect raccoons, which was the only new class he wanted to detect. In your case, you just have to replace the raccoon images with rickshaw images and follow the exact same steps. The author used Google Cloud, but you can change the config file to tune the model on a local machine. Considering you have only 730 new images, tuning shouldn't take long.
This is another good example in case things are not clear: https://towardsdatascience.com/building-a-toy-detector-with-tensorflow-object-detection-api-63c0fdf2ac95
Coming to your question about whether you need more data: more data is always better. What I would suggest is to tune the model using the steps above and check the mAP. If the mAP is low and the performance is not enough for your intended application, collect more data and tune again.
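As a quick sanity check on your class ratio before collecting more data, you can count the instances per class in train_labels.csv; a short sketch, assuming the usual TF Object Detection API CSV layout with a `class` column:

```python
import pandas as pd

# Expected columns: filename, width, height, class, xmin, ymin, xmax, ymax
labels = pd.read_csv("train_labels.csv")
print(labels["class"].value_counts())                # instances per class
print(labels["class"].value_counts(normalize=True))  # class ratios
```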
Please let me know if you have any questions.

Training a neural network with two completely different datasets

I am working with a neural network for object classification right now, and I am creating datasets for training and validation. I want to know if it is possible to create two training datasets comprising two completely different object types and labels (e.g. dataset 1 has cars and dataset 2 has cats). Does that still work, or should I create datasets where each file mixes both object types and labels across all the training files? Does such mixture/separation matter if I am training the network in one cycle with different datasets?
Depending on what you are using to train, many APIs (such as the TensorFlow Object Detection API) read the TFRecord files (datasets) in order, so the examples need to be shuffled when the files are made beforehand. Shuffling is quite important for training: without it, the model starts training on one class alone and then trains for a while on another class alone. It should reach the same standard eventually, but the model trains much better with an equal distribution of classes across the training steps.
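A minimal sketch of the shuffle-before-writing idea (the feature schema and filenames are placeholders):

```python
import random
import tensorflow as tf

def make_example(label):
    # Minimal tf.train.Example with just a label feature (placeholder schema).
    return tf.train.Example(features=tf.train.Features(feature={
        "label": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[label.encode()]))
    })).SerializeToString()

# Two "datasets" with completely different classes, as in the question.
examples = ([make_example("car") for _ in range(100)] +
            [make_example("cat") for _ in range(100)])

random.shuffle(examples)  # interleave the classes before writing the file

with tf.io.TFRecordWriter("train_mixed.tfrecord") as writer:
    for serialized in examples:
        writer.write(serialized)

# A shuffle buffer at read time mixes the examples further during training.
dataset = tf.data.TFRecordDataset("train_mixed.tfrecord").shuffle(1000)
```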

Is it possible to forward the output of a deep-learning network to another network with caffe / pycaffe?

I am using Caffe, or more precisely pycaffe, to train and create my network. I have a dataset with 5 labels. My idea was to create one network for each label that simply outputs the score for that one class. After having trained the 5 networks, I want to compare their outputs and see which one has the highest score.
Sadly, I only know how to create one network, but not how to let the networks interact, nor how to do something like a max function at the end. I have added a picture to describe what I want to do.
Moreover, I do not know whether this would give a better outcome than just a normal deep neural network.
I don't see what you expect to have as the input to this "max" function. Even if you use some sort of is / is not boundary training, your approach appears to be an inferior version of the softmax layer available in all popular frameworks.
Yes, you can build a multi-channel model, train each channel with a different data set, and then accept the most confident prediction -- but the result will take longer and be less accurate than a cooperative training pass. Your five channels wind up negotiating their boundaries after they've made other parametric assumptions.
Feed a single model all the information available from the outset; you'll get faster convergence and more accurate classification.
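To make the comparison concrete, here is a small framework-agnostic numpy sketch of the two decision rules (the logits are made up):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

# Hypothetical raw scores for one input. Five separately trained
# one-vs-rest nets each emit an independent score; nothing forces
# these scores to be calibrated against each other.
separate_scores = sigmoid(np.array([2.1, 1.9, -0.5, 0.3, -1.2]))
pred_separate = int(np.argmax(separate_scores))  # the proposed "max function"

# A single 5-way net trained jointly emits mutually exclusive class
# probabilities from one forward pass, with boundaries learned together.
joint_probs = softmax(np.array([2.1, 1.9, -0.5, 0.3, -1.2]))
pred_joint = int(np.argmax(joint_probs))
```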