Expanding CNN with GAN methodology in caffe - deep-learning

I have a network which estimates the depth map from an input image. In a nutshell, I have an input_image and its corresponding ground_truth (a depth map). Let us call this network a generator network. So far so good. Now I have heard of `Generative Adversarial Networks` and I thought I could improve my network by adding a `Discriminator` network as follows:
input -> neural network -> estimated_depth_image -> discriminator -> output: true or false depending on real or synthesised
                                                          ^
                                                          |
                                            ground_truth_depth_image
But then, how would I switch between feeding the estimated_depth_image with label 0 and feeding the ground_truth_image with label 1 into the discriminator? Is that even possible? If yes, how would you approach it?
My problem is: how do I feed my new dataset, which consists of all the ground_truth depth images with label 1 and all the estimated_depth_images with label 0, into the discriminator network at the same time using Caffe?

As far as I know this is not easy to do in Caffe, because you would need two different optimizers (one for the generator and one for the discriminator). You also need to do the forward pass only in G, and then in D using the output of G and some real depth data. I would recommend using TensorFlow, Torch, or PyTorch for that.
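For example, the alternating real/fake feeding could look roughly like the PyTorch sketch below. G, D, and the data loader are throwaway placeholders (substitute your own depth network, discriminator, and data pipeline); this is only meant to show where label 1 (real) and label 0 (generated) enter the picture, not a fixed recipe.
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder generator (image -> depth map) and discriminator (depth map -> logit).
G = nn.Sequential(nn.Conv2d(3, 1, 3, padding=1))
D = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                  nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1))

bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

# `loader` is assumed to yield (image, gt_depth) batches.
for image, gt_depth in loader:
    real = torch.ones(image.size(0), 1)
    fake = torch.zeros(image.size(0), 1)

    # Discriminator step: ground-truth depth maps are fed with label 1,
    # generated depth maps with label 0 (the generator output is detached
    # so that only D is updated here).
    d_loss = bce(D(gt_depth), real) + bce(D(G(image).detach()), fake)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: push D towards calling the generated depth map "real",
    # combined with the original supervised depth loss.
    est_depth = G(image)
    g_loss = bce(D(est_depth), real) + F.l1_loss(est_depth, gt_depth)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
In Caffe you would have to emulate this alternation yourself (e.g. two networks and manual weight copying), which is why the frameworks above are the more practical choice.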

Related

How can you increase the accuracy of ResNet50?

I'm using a ResNet50 model to classify images into two classes: normal cells and cancer cells.
I want to increase the accuracy but I don't know what to modify.
# We are using ResNet50 for transfer learning here, so we import it
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adamax
from tensorflow.keras.applications import resnet50
# Initializing the model with weights='imagenet', i.e. we are carrying over its original weights
model_name='resnet50'
base_model=resnet50.ResNet50(include_top=False, weights="imagenet", input_shape=img_shape, pooling='max')
last_layer=base_model.output # we are taking the last layer of the model
# Add flatten layer: we are extending the network by adding a flatten layer
flatten=layers.Flatten()(last_layer)
# Add dense layer
dense1=layers.Dense(100,activation='relu')(flatten)
# Add the final output layer on top of the dense layer
output_layer=layers.Dense(class_count,activation='softmax')(dense1)
# Creating the model with input and output layer
model=Model(inputs=base_model.inputs,outputs=output_layer)
model.compile(Adamax(learning_rate=.001), loss='categorical_crossentropy', metrics=['accuracy'])
There were 48 errors in 534 test cases; model accuracy = 91.01%.
Also, what do you think about the results shown in the graph?
This is the classification report.
I got good results, but is there a possibility to increase the accuracy beyond that?
This is a broad question, as there are many ways one can attempt to improve the network's accuracy. Some of them are:
Increase the dimension of the layers that are learned in transfer learning (make sure not to overfit)
Use transfer learning with convolution layers and not an MLP
Let the optimization algorithm adjust the learning rate on its own
Play with additional augmentations of the dataset
and the list goes on (a short sketch of the last two ideas follows at the end of this answer).
Also, if possible, I would suggest comparing your results to other publicly available benchmarks; by doing so you might get a better sense of the upper bound on the achievable accuracy.
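For example, extra augmentation and an automatically adjusted learning rate could look roughly like this in Keras. The layer choices, parameter values, and the train_ds / val_ds dataset names are illustrative assumptions, not a prescription.
import tensorflow as tf
from tensorflow.keras import layers, callbacks

# Random augmentations applied on the fly (these preprocessing layers are
# available in recent tf.keras versions).
augment = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
])

# Halve the learning rate whenever the validation loss stops improving.
lr_schedule = callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5,
                                          patience=3, min_lr=1e-6)

# model.fit(train_ds.map(lambda x, y: (augment(x, training=True), y)),
#           validation_data=val_ds, epochs=50, callbacks=[lr_schedule])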

Is it possible to use pytorch for real-time input data with post-processing

I'm constructing a projector-camera system, and I want to build radiometric compensation for it using deep learning.
Is it possible to use a network as below? (I guess the gradient does not flow, so the weights will not be updated, but I cannot be sure.)
0. I have a ground truth image GT. Set Input_image = GT
While True:
    1. Encoder-decoder network: projection_image = network(Input_image)
    2. Project projection_image and capture it as Cap
    3. Loss calculation: loss = RMSE(Cap, GT)
    4. Input_image = projection_image
For this situation:
If I assume ordinary deep learning, the loss would be calculated between the direct output of the network (projection_image) and the ground truth data GT. Of course, that works.
However, in my case I want to calculate the loss between the post-processed network output (network output image -> projection -> capture) and GT.
Here, the post-processing is done on the CPU, outside the graph, so I guess the loss does not affect the network weights. Indeed, in my code the network was not updated.
Is it possible to solve my problem?
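To make the gradient-flow concern concrete, here is a rough PyTorch sketch of the loop above. project_and_capture stands in for the physical projector/camera step and is not a real function; the network and tensors are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

network = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1))
optimizer = torch.optim.Adam(network.parameters(), lr=1e-4)

gt = torch.rand(1, 3, 64, 64)              # ground truth image GT
input_image = gt.clone()                   # Input_image = GT

projection_image = network(input_image)    # differentiable so far
captured = project_and_capture(projection_image.detach().cpu().numpy())
cap = torch.from_numpy(captured)           # re-enters PyTorch as a NEW leaf tensor

loss = torch.sqrt(F.mse_loss(cap, gt))     # RMSE(Cap, GT)
# `loss` has no computational-graph path back to network's parameters, so
# loss.backward() either raises an error (nothing requires grad) or, at best,
# leaves the network's weights untouched -- matching the observed behaviour.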

Tensorflow Multiple Input Loss Function

I am trying to implement a CNN in Tensorflow (quite similar architecture to VGG), which then splits into two branches after the first fully connected layer. It follows this paper: https://arxiv.org/abs/1612.01697
Each of the two branches of the network outputs a set of 32 numbers. I want to write a joint loss function, which will take 3 inputs:
The predictions of branch 1 (y)
The predictions of branch 2 (alpha)
The labels Y (ground truth) (q)
and calculate a weighted loss, as in the image below:
[image: loss function definition]
# q_hat: alpha-weighted combination of the branch-1 predictions y
q_hat = tf.divide(tf.reduce_sum(tf.multiply(alpha, y), 0), tf.reduce_sum(alpha, 0))
# absolute error between the weighted prediction and the ground truth q
loss = tf.abs(tf.subtract(q_hat, q))
I understand that I need to use the tf functions in order to implement this loss function. Having implemented the above, the network trains, but once trained it does not output the expected results.
Has anyone ever tried combining outputs of two branches of a network in one joint loss function? Is this something TensorFlow supports? Maybe I am making a mistake somewhere here? Any help whatsoever would be greatly appreciated. Let me know if you would like me to add any further details.
From TensorFlow's perspective, there is absolutely no difference between a "regular" CNN graph and a "branched" graph. For TensorFlow, it is just a graph that needs to be executed. So TensorFlow certainly supports this. "Combining two branches into a joint loss" is also nothing special. In fact, it is "good" that the loss depends on both branches: it means that when you ask TensorFlow to compute the loss, it will have to do the forward pass through both branches, which is what you want.
One thing I noticed is that your code for the loss is different from the one in the image. Your code appears to do this: https://ibb.co/kbEH95
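For reference, a rough sketch of a two-branch model trained with one joint loss in tf.keras (modern API rather than the original graph style). The architecture, the shapes, and the axis over which the weighted average is taken are illustrative assumptions; adjust them to match the paper's equation.
import tensorflow as tf
from tensorflow.keras import layers, Model

inputs = tf.keras.Input(shape=(64, 64, 3))
trunk = layers.Conv2D(16, 3, activation="relu")(inputs)
trunk = layers.GlobalAveragePooling2D()(trunk)
shared = layers.Dense(128, activation="relu")(trunk)
y_branch = layers.Dense(32, name="y")(shared)                             # branch 1: predictions y
alpha_branch = layers.Dense(32, activation="relu", name="alpha")(shared)  # branch 2: weights alpha
model = Model(inputs, [y_branch, alpha_branch])

def joint_loss(q, y, alpha, eps=1e-8):
    # Weighted average of y over the 32 outputs of each sample, weighted by alpha.
    q_hat = tf.reduce_sum(alpha * y, axis=-1) / (tf.reduce_sum(alpha, axis=-1) + eps)
    return tf.reduce_mean(tf.abs(q_hat - q))

optimizer = tf.keras.optimizers.Adam(1e-3)

@tf.function
def train_step(x, q):
    with tf.GradientTape() as tape:
        y, alpha = model(x, training=True)
        loss = joint_loss(q, y, alpha)
    # One gradient computation covers both branches, because the loss depends on both.
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss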

Can I use autoencoder for clustering?

In the code below, they use an autoencoder for supervised clustering or classification, because they have data labels.
http://amunategui.github.io/anomaly-detection-h2o/
But can I use an autoencoder to cluster data if I do not have labels?
The deep-learning autoencoder is always unsupervised learning. The "supervised" part of the article you link to is to evaluate how well it did.
The following example (taken from ch.7 of my book, Practical Machine Learning with H2O, where I try all the H2O unsupervised algorithms on the same data set - please excuse the plug) takes 563 features, and tries to encode them into just two hidden nodes.
m <- h2o.deeplearning(
  2:564, training_frame = tfidf,
  hidden = c(2), autoencoder = TRUE, activation = "Tanh"
)
f <- h2o.deepfeatures(m, tfidf, layer = 1)
The second command there extracts the hidden node activations. f is a data frame with two numeric columns, and one row for every row in the tfidf source data. I chose just two hidden nodes so that I could plot the clusters:
Results will change on each run. You can (maybe) get better results with stacked auto-encoders, or using more hidden nodes (but then you cannot plot them). Here I felt the results were limited by the data.
BTW, I made the above plot with this code:
d <- as.matrix(f[1:30,]) #Just first 30, to avoid over-cluttering
labels <- as.vector(tfidf[1:30, 1])
plot(d, pch = 17) #Triangle
text(d, labels, pos = 3) #pos=3 means above
(P.S. The original data came from Brandon Rose's excellent article on using NLTK. )
In some respects, encoding data and clustering data share overlapping theory. As a result, you can use autoencoders to cluster (encode) data.
A simple example to visualize is if you have a set of training data that you suspect has two primary classes, such as voter history data for Republicans and Democrats. If you take an autoencoder, encode the data to two dimensions, and then plot it on a scatter plot, this clustering becomes clearer. Below is a sample result from one of my models; you can see a noticeable split between the two classes, as well as a bit of the expected overlap.
The code can be found here
This method is not restricted to two classes; you could train on as many different classes as you wish. Two polarized classes are just easier to visualize.
This method is not limited to two output dimensions, that was just for plotting convenience. In fact, you may find it difficult to meaningfully map certain, large dimension spaces to such a small space.
In cases where the encoded (clustered) layer is larger in dimension, it is not as easy to "visualize" the feature clusters. This is where it gets a bit more difficult, as you'll have to use some form of supervised learning to map the encoded (clustered) features to your training labels.
A couple of ways to determine which class the features belong to are to feed the encoded data into a clustering algorithm such as k-means, or, what I prefer, to take the encoded vectors and pass them to a standard backpropagation neural network. Note that depending on your data, you may find that just feeding the data straight into your backpropagation network is sufficient.
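For example, a rough Keras/scikit-learn version of the same idea (train an autoencoder with a small bottleneck, then cluster or plot the encodings); the shapes, layer sizes, and random stand-in data below are purely illustrative.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model
from sklearn.cluster import KMeans

x = np.random.rand(1000, 563).astype("float32")         # stand-in for your unlabeled data

inputs = tf.keras.Input(shape=(563,))
h = layers.Dense(64, activation="relu")(inputs)
bottleneck = layers.Dense(2, activation="tanh")(h)       # the "clustered" layer
h = layers.Dense(64, activation="relu")(bottleneck)
outputs = layers.Dense(563, activation="linear")(h)

autoencoder = Model(inputs, outputs)
encoder = Model(inputs, bottleneck)                       # analogous to h2o.deepfeatures(..., layer = 1)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(x, x, epochs=20, batch_size=32, verbose=0)

codes = encoder.predict(x)                                # 2-D encodings, one row per sample
cluster_ids = KMeans(n_clusters=2, n_init=10).fit_predict(codes)   # or plot `codes` directly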

Need hint for the Exercise posed in the Tensorflow Convolution Neural Networks Tutorial

Below is the exercise question posed on this page https://www.tensorflow.org/versions/0.6.0/tutorials/deep_cnn/index.html
EXERCISE: The output of inference are un-normalized logits. Try
editing the network architecture to return normalized predictions
using tf.softmax().
In the spirit of the exercise, I want to know if I'm on the right-track (not looking for the coded-up answer).
Here's my proposed solution.
Step 1: The last layer (of the inference) in the example is a "softmax_linear", i.e., it simply does the unnormalized WX+b transformation. As stipulated, we apply the tf.nn.softmax operation with softmax_linear as input. This normalizes the output as probabilities in the range [0, 1].
Step 2: The next step is to modify the cross-entropy calculation in the loss-function. Since we already have normalized output, we need to replace the tf.nn.softmax_cross_entropy_with_logits operation with a plain cross_entropy(normalized_softmax, labels) function (that does not further normalize the output before calculating the loss). I believe this function is not available in the tensorflow library; it needs to be written.
That's it. Feedback is kindly solicited.
Step 1 is more than sufficient if you insert the tf.nn.softmax() in cifar10_eval.py (and not in cifar10.py). For example:
logits = cifar10.inference(images)
normalized_logits = tf.nn.softmax(logits)
top_k_op = tf.nn.in_top_k(normalized_logits, labels, 1)
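For reference, the "plain" cross-entropy from Step 2 of the question could look roughly like this (TF 1.x style, to match the tutorial). The clipping is the numerical-stability work that tf.nn.softmax_cross_entropy_with_logits otherwise handles for you, which is the main reason the fused op exists.
import tensorflow as tf

def plain_cross_entropy(normalized_probs, one_hot_labels):
    # Clip to avoid log(0) on already-normalized probabilities.
    clipped = tf.clip_by_value(normalized_probs, 1e-10, 1.0)
    return -tf.reduce_sum(one_hot_labels * tf.log(clipped), axis=1)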