Conditional GAN (pix2pix) OR CycleGAN? - deep-learning

I have a dataset including paired MRI and CT scans of patients. My aim is to generate synthetic CT from MRI images. Since I have paired images, which GAN is best for this purpose: CycleGAN or pix2pix? Which one results in a synthetic CT of higher quality? Can I use CycleGAN and feed the model the paired images in an unpaired manner? Does CycleGAN have any advantages over pix2pix for my purpose?
Any advice would be highly appreciated 🙏

Neither of us can know that in advance, but you can build a conditional CycleGAN to exploit the pairing. In my case, the dataset determined the image quality: reducing the number of bad samples helped the most. Both pix2pix and CycleGAN can work well. If you are focused on higher resolution (sharper but noisier) output, you can choose a ResNet generator; if your task is closer to segmentation, I think a U-Net generator is better (https://biomedical-engineering-online.biomedcentral.com/articles/10.1186/s12938-019-0682-x)
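One concrete advantage of paired data is the pixel-wise L1 term in the pix2pix generator objective, which CycleGAN cannot use directly because it has no aligned target. A minimal NumPy sketch of that loss (toy arrays and a hypothetical discriminator score; λ = 100 is the default weight from the pix2pix paper):

```python
import numpy as np

def l1_loss(fake_ct, real_ct):
    # Pixel-wise L1 term that paired data enables; CycleGAN has no
    # aligned target, so it must fall back on cycle-consistency instead.
    return np.mean(np.abs(fake_ct - real_ct))

def generator_loss(d_score_on_fake, fake_ct, real_ct, lam=100.0):
    # pix2pix generator objective: adversarial term + lambda * L1 term.
    adv = -np.log(d_score_on_fake + 1e-8)  # reward fooling the discriminator
    return adv + lam * l1_loss(fake_ct, real_ct)

# Toy example: two candidate "synthetic CT" slices for the same MRI input.
rng = np.random.default_rng(0)
real = rng.random((64, 64))
close = real + 0.01 * rng.standard_normal((64, 64))  # near the true CT
far = rng.random((64, 64))                           # unrelated image

# With the same discriminator score, the L1 term rewards the closer output --
# supervision an unpaired model cannot provide.
assert generator_loss(0.5, close, real) < generator_loss(0.5, far, real)
```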

Related

Best Neural Network architecture for traditional large multiclass classification problem

I am new to deep learning (I just finished reading Deep Learning with PyTorch), and I was wondering what the best neural network architecture is for my case.
I have a large multiclass classification problem (a user identification problem) with about 1000 classes, where each class is a user. I have about 2000 features per user after one-hot encoding and cleaning. The data are highly imbalanced, but I can always use oversampling/undersampling techniques.
I was wondering what the best architecture to implement is for my case. I've always seen deep learning applied to time series or images, so I'm not sure what to use here. I was thinking about a multi-layer perceptron, but maybe there are better solutions.
Thanks for your tips and help. Have a nice day!
You can try triplet learning instead of plain classification.
From your 1000 users you can form roughly c * 1000 * 999 / 2 pairs, where c is the average number of samples per class/user.
https://arxiv.org/pdf/1412.6622.pdf
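A minimal NumPy sketch of the triplet loss from that line of work (toy 2-D embeddings; the margin value is illustrative): same-user embeddings are pulled together, different users pushed at least a margin apart.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Hinge on squared distances: positive pair closer than negative
    # pair by at least `margin`, otherwise the violation is penalized.
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

# Toy embeddings: two samples of user A, one of user B.
a1 = np.array([1.0, 0.0])
a2 = np.array([0.9, 0.1])
b = np.array([-1.0, 0.2])

assert triplet_loss(a1, a2, b) == 0.0   # already well separated, no loss
assert triplet_loss(a1, b, a2) > 0.0    # violated triplet is penalized
```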

Is it theoretically reasonable to use CNN for data like categorical and numeric data?

I'm trying to use a CNN for binary classification.
Since CNNs show their strength in feature extraction, they have seen many uses on pattern data such as images and audio.
However, the dataset I have is not image or audio data but categorical and numerical data, which is a different situation.
My questions are as follows.
In this situation, is it theoretically reasonable to use a CNN on data in this configuration?
If it is, would it be reasonable to artificially arrange my dataset in a two-dimensional form and apply a 2D CNN?
I often see CNN-based classifiers on Kaggle and in various media, applied not only to images and audio but also to numerical and categorical data like mine.
I really wonder whether this is theoretically sound, and I would appreciate any recommendations for related papers or research.
I'm looking forward to hearing any advice about this situation. Thank you.
CNNs for images apply kernels to neighboring pixels and blocks of the image. CNNs for audio work on spectrograms, i.e., they likewise exploit proximity in the input data.
If your input data have some sort of closeness (e.g., time series, graphs, ...), then a CNN might be useful.
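A tiny NumPy sketch of why convolution presumes ordered inputs: the kernel only mixes neighboring values, which is meaningful for a time series but not for arbitrarily ordered categorical columns, since permuting the columns changes who the "neighbors" are.

```python
import numpy as np

def conv1d(x, kernel):
    # Valid-mode 1D convolution: each output mixes only k adjacent inputs.
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

# An edge-detecting kernel responds to local structure (here, a rising ramp).
# That response is only meaningful if adjacency in the input means something.
signal = np.array([0.0, 0.0, 1.0, 2.0, 3.0, 3.0])
edge_kernel = np.array([-1.0, 0.0, 1.0])

response = conv1d(signal, edge_kernel)
assert response.argmax() in (1, 2)  # peak response sits on the ramp
```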

Train a reinforcement learning model with a large amount of images

I am tentatively trying to train a deep reinforcement learning model on a maze-escaping task, where the model takes one image as input each time (e.g., a different maze).
Suppose I have about 10K different maze images, and the ideal case is that after training on N mazes, my model would do a good job of quickly solving the puzzles in the remaining 10K - N images.
I am writing to ask for good ideas or empirical evidence on how to select a good N for this training task.
And in general, how should I estimate and enhance the "transfer learning" ability of my reinforcement learning model, i.e., make it generalize better?
Any advice or suggestions would be appreciated very much. Thanks.
First,
I strongly recommend using 2D arrays for the maze maps instead of images; it would do your model a huge favor, because it is a more feature-extracted representation. Try 2D arrays in which walls are marked by ones on a ground of zeros.
And about finding the optimal N:
Your model architecture is far more important than the share of training data or the batch size. It's better to build a well-designed model first and then find a good value of N by testing different Ns (because N is only one variable, you can easily optimize it yourself).
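To illustrate the suggested representation: a NumPy sketch (with a hypothetical grayscale maze image, assuming dark pixels are walls) of thresholding an image into a 2D wall/ground array.

```python
import numpy as np

# Hypothetical grayscale maze image: dark pixels (walls) vs. light pixels (floor).
rng = np.random.default_rng(1)
maze_img = np.where(rng.random((8, 8)) < 0.3, 20, 235).astype(np.uint8)

# Threshold into the suggested representation: 1 = wall, 0 = free ground.
# The agent then sees an 8x8 binary grid instead of raw pixel intensities.
maze_grid = (maze_img < 128).astype(np.uint8)

assert maze_grid.shape == maze_img.shape
assert set(np.unique(maze_grid)) <= {0, 1}
```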

Adding noise to image for deep learning, yes or no?

I've been thinking that adding noise to an image can prevent overfitting and also "increase" the dataset by adding variations to it. I'm only adding some random 1s to images of shape (256, 256, 3) that use uint8 to represent color. I don't think this affects the visualization at all (I displayed both images with matplotlib and they look almost the same), and there is only a ~0.01 mean difference in the sum of their values.
But it doesn't seem to bring any advantage. After training for a long time, the model is still not as good as the one that doesn't use noise.
Has anyone tried using noise for image classification tasks like this? Is it ultimately better?
I wouldn't add noise to your data. Some papers employ input deformations during training to increase the robustness and convergence speed of models. However, these deformations are statistically inefficient (not just for images, but for any kind of data).
You can read "Intriguing properties of neural networks" by Szegedy et al. for more details (and refer to references 9 and 13 there for papers that use deformations).
If you want to avoid overfitting, you might be interested in reading about regularization instead.
Yes, you may add noise to extend your dataset and avoid overfitting your training set, but make sure it is random; otherwise your network will take this noise as something it should learn (and that's not something you want). I wouldn't use this method first, though: I would start by rotating and/or flipping my samples.
Either way, your network should perform better than, or at least as well as, your previous network.
The first things I would check are: how do you measure your performance? What were your results before and after? And did you change anything else?
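If you do add noise, two details matter in practice: draw it fresh on every pass, and clip so the uint8 values cannot wrap around (255 + 1 would overflow to 0 and create a visible artifact). A small NumPy sketch, with illustrative function names; the flip at the end shows the label-preserving augmentation suggested above.

```python
import numpy as np

def add_random_noise(img, amplitude=1, seed=None):
    # Fresh random noise per call (fixed noise would just be learned away),
    # computed in int16 and clipped so uint8 arithmetic cannot wrap around.
    rng = np.random.default_rng(seed)
    noise = rng.integers(-amplitude, amplitude + 1, size=img.shape)
    return np.clip(img.astype(np.int16) + noise, 0, 255).astype(np.uint8)

img = np.full((256, 256, 3), 128, dtype=np.uint8)
noisy = add_random_noise(img, amplitude=1, seed=0)

assert noisy.dtype == np.uint8
assert np.abs(noisy.astype(int) - img.astype(int)).max() <= 1

# A horizontal flip is often a better first augmentation: noise-free
# and label-preserving for most natural-image classification tasks.
flipped = img[:, ::-1, :]
assert flipped.shape == img.shape
```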
There are a couple of works that deal with this problem. Because you make the training set harder, the training error will be higher, but your generalization might be better. It has been shown that adding noise can have stabilizing effects when training generative adversarial networks (adversarial training).
For classification tasks it is not that cut and dried. Not many works have actually dealt with this topic. The closest one, to my best knowledge, is this one from Google (https://arxiv.org/pdf/1412.6572.pdf), where they show the limitations of training without noise. They do report a regularization effect, but no actual improvement over other methods.

Training model to recognize one specific object (or scene)

I am trying to train a learning model to recognize one specific scene. For example, say I would like to train it to recognize pictures taken at an amusement park, and I already have 10 thousand pictures taken at an amusement park. I would like to train the model with those pictures so that it can give other pictures a score for the probability that they were taken at an amusement park. How do I do that?
Considering this is an image recognition problem, I would probably use a convolutional neural network, but I am not quite sure how to train it in this case.
Thanks!
There are several possible approaches. The most trivial one is to collect a large number of negative examples (images from other places) and train a two-class model.
The second approach is to train a network to extract meaningful low-dimensional representations (embeddings) from an input image. Here you can use siamese training to explicitly teach the network similarities between images; such an approach is employed for face recognition, for instance (see FaceNet). Given such embeddings, you can use well-established outlier-detection methods, for instance a one-class SVM, or any other classifier (an ordinary classifier again requires negative examples).
I would heavily augment your data using image cropping; it is the most obvious way to increase the amount of training data in your case.
In general, your success in this task strongly depends on the task statement (are you restricted to parks only, or any kind of place?) and on proper data.
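As a sketch of the embeddings-plus-outlier-detection route (assuming scikit-learn is available; the "embeddings" here are just simulated clusters standing in for the output of a pretrained network): fit a one-class SVM on park embeddings only, then score new images, with no negative examples needed at training time.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Simulated embeddings: in practice these would come from a pretrained
# or siamese-trained network, not from a random generator.
rng = np.random.default_rng(0)
park_embeddings = rng.normal(loc=0.0, scale=0.5, size=(200, 16))  # positives
other_embedding = rng.normal(loc=5.0, scale=0.5, size=(1, 16))    # a non-park image

# Fit on positive examples only; nu bounds the fraction of training outliers.
clf = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(park_embeddings)

# decision_function gives a continuous score usable as a "park-ness" ranking.
assert clf.decision_function(park_embeddings).mean() > clf.decision_function(other_embedding)[0]
assert clf.predict(other_embedding)[0] == -1  # flagged as an outlier
```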