What is the best approach to generating predicted heatmaps from fixation maps? - deep-learning

If I have an image (a) and a corresponding fixation heatmap (c) for each image, what model would be best for generating predicted heatmaps for new images?
I have tried implementing a U-Net segmentation model where both (a) and (c) are given as inputs. However, the model does not output predicted heatmaps; it only tries to segment the original image (a). Is a segmentation approach the best solution to this problem?
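For what it's worth, the usual setup for this kind of saliency prediction is to feed only the image (a) as input and use the fixation heatmap (c) as the regression target rather than as a second input. A minimal PyTorch sketch of one training step, with a toy stand-in for the U-Net and illustrative loss and size choices:

```python
import torch
import torch.nn as nn

# Minimal training-step sketch: the image (a) is the only input,
# and the fixation heatmap (c) is the regression target.
# `model` stands in for any encoder-decoder (e.g. a U-Net) whose
# final layer outputs a single channel at the input's spatial size.
model = nn.Sequential(                      # placeholder for a real U-Net
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 1),
)
criterion = nn.MSELoss()                    # or BCEWithLogitsLoss / a KL-divergence loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

images = torch.rand(8, 3, 128, 128)         # batch of input images (a)
heatmaps = torch.rand(8, 1, 128, 128)       # corresponding fixation heatmaps (c)

pred = model(images)                        # predicted heatmaps
loss = criterion(pred, heatmaps)            # compare prediction with ground truth
loss.backward()
optimizer.step()
```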

Related

PyTorch Geometric GCN Autoencoder with Flat Latent Space

I have a problem in which I have a series of observations, each of which is a graph of the same structure, but with different node features. I would like to learn a flat embedding of each graph of size 32x1.
My thought was to do this with an autoencoder. This would take the input graph, apply some graph convolutions, use a dense layer to map the graph to a 32x1 latent space, and then reconstruct the graph (using the same common structure) before applying a few more convolutions.
As far as I am aware, this is in contrast to the typical graph autoencoder framework, in which the latent representation is a graph of the same structure as the input but with latent representations of each node's features.
For this reason, I am not sure how to implement such an architecture using PyTorch Geometric. Namely, I am unsure how I go from the flat latent space back to a graph.
Is this possible, and if so, roughly how would I do so?
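One way to realise this in PyTorch Geometric, given that every graph shares the same known structure, is to pool the node embeddings into a flat code and use a linear layer to map that code back to per-node features before a few decoder convolutions. A rough sketch (the layer sizes and the FlatGraphAE name are illustrative, not a tested implementation):

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv, global_mean_pool

class FlatGraphAE(nn.Module):
    """Graph autoencoder with a flat 32-d latent space, assuming every
    observation shares one fixed edge_index (same structure, different
    node features)."""

    def __init__(self, num_nodes, in_dim, hidden=64, latent=32):
        super().__init__()
        self.num_nodes, self.hidden = num_nodes, hidden
        # encoder: graph convolutions, then pool to one vector per graph
        self.enc1 = GCNConv(in_dim, hidden)
        self.enc2 = GCNConv(hidden, hidden)
        self.to_latent = nn.Linear(hidden, latent)
        # decoder: because the structure is known, a linear layer can map
        # the flat code straight back to one hidden vector per node
        self.from_latent = nn.Linear(latent, num_nodes * hidden)
        self.dec1 = GCNConv(hidden, hidden)
        self.dec2 = GCNConv(hidden, in_dim)

    def encode(self, x, edge_index, batch):
        h = self.enc1(x, edge_index).relu()
        h = self.enc2(h, edge_index).relu()
        return self.to_latent(global_mean_pool(h, batch))   # (num_graphs, latent)

    def decode(self, z, edge_index):
        # edge_index is the same (possibly batched) fixed structure
        h = self.from_latent(z).view(-1, self.hidden).relu()
        h = self.dec1(h, edge_index).relu()
        return self.dec2(h, edge_index)                      # reconstructed node features

    def forward(self, x, edge_index, batch):
        z = self.encode(x, edge_index, batch)
        return self.decode(z, edge_index), z
```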

What are the ideal steps in using predicted segmentation masks for watershed post-processing?

I am experimenting with object segmentation (round-shaped objects that often occur close together). I have used the U-Net deep neural network architecture for segmentation and obtained segmentation masks, which I saved in .npy format.
I am a beginner in this area. I would like to know the ideal steps that I should follow now, if I want to apply watershed on the predicted masks with the aim of separating the objects.
I guess I need to convert the predicted binary mask into some form from which I can obtain markers indicating the object centroids.
Please help
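A common watershed post-processing recipe for touching round objects is: threshold the predicted mask, compute a distance transform, take its local maxima as markers, and run watershed on the inverted distance map. A minimal sketch with scipy/scikit-image (the file name, threshold, and min_distance are placeholders to adapt to your data):

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.segmentation import watershed
from skimage.feature import peak_local_max

# Load the predicted mask and binarise it (file name and threshold are placeholders).
mask = np.load("predicted_mask.npy")
binary = mask > 0.5

# 1. Distance transform: each foreground pixel gets its distance to the background.
distance = ndi.distance_transform_edt(binary)

# 2. Markers: local maxima of the distance map approximate object centres.
coords = peak_local_max(distance, min_distance=10, labels=binary)
markers = np.zeros_like(distance, dtype=int)
markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)

# 3. Watershed on the inverted distance map, constrained to the mask,
#    splits touching objects into separate labels.
labels = watershed(-distance, markers, mask=binary)
```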

3D annotation for instance segmentation

I'm trying to annotate some data for 3D instance segmentation. While it's fairly straightforward to draw masks for each 2D plane, it's not obvious how to connect the same "instances" together post-annotation (i.e. connect the "red" masks together, connect the "blue" masks together) without laboriously making sure the instances are instance-matched (i.e. colour-coded so that "red" masks always connect with "red" masks).
A naive approach I have thought of is to make many 2D segmentation masks, and calculate the center of mass for each object detected. I can later re-assign the instances based on the closest matching center of mass, but I worry this would inadvertently generate "crossed-over" segmentation instances (illustrated below). What are some high-throughput strategies to generate 3D annotations?
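For reference, a minimal sketch of the centroid-matching idea described above, using scipy's center_of_mass and a Hungarian assignment between adjacent slices (max_dist is a hypothetical cut-off; this does not by itself rule out the crossed-over failure mode):

```python
import numpy as np
from scipy import ndimage as ndi
from scipy.optimize import linear_sum_assignment

def match_slices(labels_a, labels_b, max_dist=20.0):
    """Match instance labels between two adjacent 2D slices by centroid
    distance. Assumes background is label 0; max_dist is a hypothetical
    cut-off beyond which an instance is treated as a new object."""
    ids_a = np.unique(labels_a)[1:]                     # skip background
    ids_b = np.unique(labels_b)[1:]
    ca = np.array(ndi.center_of_mass(labels_a > 0, labels_a, ids_a))
    cb = np.array(ndi.center_of_mass(labels_b > 0, labels_b, ids_b))
    # pairwise centroid distances, then one-to-one Hungarian assignment
    cost = np.linalg.norm(ca[:, None, :] - cb[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)
    return {ids_b[c]: ids_a[r] for r, c in zip(rows, cols) if cost[r, c] <= max_dist}
```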
The boundaries of your 2D slices could be used as constraints to obtain the optimal 3D surface, as proposed in [1].
However, I think it is easier to generate 3D labels from markers, as in [2]. An implementation is available here (feel free to open an issue if you encounter any problems :P).
Also, the napari package could be useful to develop the GUI without much effort.
[1] Grady, Leo. "Minimal surfaces extend shortest path segmentation methods to 3D." IEEE Transactions on Pattern Analysis and Machine Intelligence 32.2 (2008): 321-334.
[2] Falcão, Alexandre X., and Felipe PG Bergo. "Interactive volume segmentation with differential image foresting transforms." IEEE Transactions on Medical Imaging 23.9 (2004): 1100-1108.
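To illustrate the napari suggestion above, a minimal viewer setup might look like this (the array shapes are placeholders; the labels layer can then be painted slice by slice or in 3D with napari's built-in tools):

```python
import numpy as np
import napari

volume = np.random.random((64, 256, 256))        # placeholder 3D image volume
labels = np.zeros(volume.shape, dtype=np.int32)   # instance labels to edit

viewer = napari.Viewer()
viewer.add_image(volume, name="volume")           # raw data layer
viewer.add_labels(labels, name="instances")       # editable labels layer
napari.run()                                      # start the GUI event loop
```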
You can use 3D Slicer's Segment Editor. It is free, open-source, has many built-in tools, and is customizable/extensible in Python or C++ (you can plug in your own segmentation method with minimal effort). To solve a segmentation task, you typically first figure out a good segmentation workflow (what tools to use, in what combination, and with what parameters) using the interactive GUI; then, if necessary, you can make it semi-automatic or fully automatic using Python scripting.
You can create a segmentation by contouring every image slice, but that would be too tedious. Instead, you can use 3D region growing (the Grow from seeds effect) or segment on just a few slices and interpolate between them (the Fill between slices effect).

How to design the Embedding Layer in Neural Network in order to have a better quality?

Recently I have been learning about the idea behind the embedding layer in neural networks. The best explanation I have found so far is here. It addresses well the core concept of why to use an embedding layer and how it works.
It also mentions that the embedding maps similar words to a similar region, and thus the quality of an embedding representation is how close a group of similar items from the original space ends up being in the embedding space. But I really have no idea how to achieve this.
My question is: how do I design the weight matrix in order to get a better embedding representation that is customised for a specific dataset?
Any hint would be really helpful to me!
Thank you all!
Assuming you know some concepts of neural networks and Word2Vec, I will try to explain things briefly.
1. The weight matrix in the embedding layer is often randomly initialized, just like the weights in other types of neural network layers.
2. The weight matrix in the embedding layer transforms the sparse input into a dense vector, as explained in the post you mentioned.
3. The weight matrix in the embedding layer is updated during training on your dataset via backpropagation.
Therefore, after training, the learned weight matrix should give you better representations of your specific data. Just as with word embeddings, more data often yields better representations in the embedding layer. Another factor is the number of dimensions (generally speaking, the higher the dimensionality, the more degrees of freedom the model has to learn representations of the features).
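A minimal PyTorch sketch of points 1-3: the embedding weight matrix is randomly initialised and then learned end-to-end from your data via backpropagation (the vocabulary size, dimensionality, and toy classification task are illustrative):

```python
import torch
import torch.nn as nn

vocab_size, embed_dim = 10_000, 64      # illustrative sizes
embedding = nn.Embedding(vocab_size, embed_dim)   # randomly initialised weight matrix
classifier = nn.Linear(embed_dim, 2)              # toy downstream task
optimizer = torch.optim.Adam(list(embedding.parameters()) + list(classifier.parameters()))

token_ids = torch.randint(0, vocab_size, (32, 20))  # batch of token-id sequences
targets = torch.randint(0, 2, (32,))

vectors = embedding(token_ids).mean(dim=1)          # sparse ids -> dense vectors, pooled
loss = nn.functional.cross_entropy(classifier(vectors), targets)
loss.backward()                                     # gradients flow into embedding.weight
optimizer.step()                                    # embedding is updated with the rest of the model
```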

Why does Convolutional network need multiple feature maps?

I am a beginner of deep learning. For convolutional networks such as lenet-5, there are 6 feature maps in the C1 layer. Each feature map is associated with a unique convolution kernel (5x5 matrix).
What is the difference between any 2 feature maps in the same layer? For a black-and-white image dataset like MNIST (without RGB), people still use 6 feature maps.
I guess that, initially, the 6 convolution kernels are randomly generated 5x5 matrices. Therefore, when the same input image is passed through the different kernels, the outputs of the feature maps will be different. And this is the main motivation, right?
Every filter in your convolutional layer extracts a specific feature from the input. One filter could be sensitive to horizontal edges while another is sensitive to vertical edges; a third filter may be sensitive to a triangular shape. You want the feature maps to be as different from each other as possible to avoid redundancy. Avoiding redundancy improves the network's capacity to capture as many variations in the data as possible.
Random initialization prevents learning duplicate filters.
Why 6 feature maps? This is a result of trying out other numbers of filters. Keep in mind that increasing the number of filters results in higher computational overhead and possibly overfitting (memorizing the training data but not classifying new images correctly). Another intuition for 6 is that there is not that much variation at the raw pixel level; you will extract more complex features in subsequent layers. Six feature maps for C1 ended up working well for the MNIST dataset.
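As an illustration, here is what C1 looks like in PyTorch: six randomly initialised 5x5 kernels applied to the same single-channel input, giving six different feature maps (sizes follow the LeNet-5 convention of 32x32 inputs):

```python
import torch
import torch.nn as nn

# C1 in a LeNet-5-style network: six 5x5 kernels, each randomly
# initialised, so each produces a different feature map from the
# same grayscale input.
conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)

x = torch.randn(1, 1, 32, 32)     # one grayscale image
feature_maps = conv1(x)           # shape: (1, 6, 28, 28) -- six distinct maps
print(conv1.weight.shape)         # torch.Size([6, 1, 5, 5]): one kernel per map
```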