I have 3 different datasets, all of them blood smear images stained with the same chemical substance. Blood smear images are images that capture your blood, including the red and white blood cells inside.
The first dataset contains 2 classes: normal vs blood cancer
The second dataset contains 2 classes: normal vs blood infection
The third dataset contains 2 classes: normal vs sickle cell disease
So, what I want to do is: when I input a blood smear image, the AI system will tell me whether it is normal, blood cancer, blood infection or sickle cell disease (a 4-class classification task).
What should I do?
Should I mix these 3 datasets and train only 1 model to detect 4 classes?
Or should I train 3 different models and then combine them? If yes, what method should I use to combine them?
Update: I searched for a while. Can this task be called "Learning without forgetting"?
I think it depends on the data.
You may use three different models and make three binary predictions on each image. So you get a vote (probability) for each x vs. normal. If the binary classifications are accurate, this should deliver okay results. But the misclassification errors of the three models accumulate in this case.
If you can afford it, you can train a four-class model and compare its test error to that of the series of binary classifications. I understand that you already have three models, so training another one may not be too expensive.
If ONLY one of the classes can occur, a four-class model might be the way to go. If in fact two (or more) classes can occur jointly, a series of binary classifications would make sense.
As #Peter said, it is totally data dependent. If the images of the 4 classes (normal, blood cancer, blood infection, sickle cell disease) are easily distinguishable with the naked eye and there is no scope for confusion among the classes, then you should simply go for 1 model which outputs probabilities for all 4 classes (as mentioned by #maxi marufo). If the images are NOT distinguishable with the naked eye, or there is a lot of scope for confusion between the classes, then you should use 3 different models, but then you'll need to combine their outputs. You simply get the predicted probabilities from all 3 models, say p1(normal) and p1(c1), p2(normal) and p2(c2), p3(normal) and p3(c3). Now you can take average(p1(normal), p2(normal), p3(normal)) as p(normal) and then use a softmax over p(normal), p1(c1), p2(c2), p3(c3). Out of the multiple ways you could try, the above is one.
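The averaging-then-softmax idea described above can be sketched in a few lines of plain Python. This is only an illustration: the function name `combine_binary_models` and the example probabilities are made up, and each binary model is assumed to return a pair (p_normal, p_disease).

```python
import math

def combine_binary_models(p1, p2, p3):
    """Each argument is a (p_normal, p_disease) pair from one binary model."""
    # Average the three "normal" estimates into a single score.
    normal = (p1[0] + p2[0] + p3[0]) / 3.0
    scores = [normal, p1[1], p2[1], p3[1]]
    # Softmax turns the four scores into a distribution that sums to 1.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["normal", "blood cancer", "blood infection", "sickle cell disease"]
# Hypothetical outputs of the three binary models for one image.
probs = combine_binary_models((0.2, 0.8), (0.9, 0.1), (0.85, 0.15))
prediction = labels[probs.index(max(probs))]
print(prediction, probs)
```

Because the inputs are already probabilities in [0, 1], the softmax flattens the differences between them; dividing the four scores by their sum is a possible alternative that keeps the same ranking.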
This is a multiclass classification problem. You can train just one model, with the final layer being a fully connected (dense) layer of 4 units (i.e. the output dimension) and a softmax activation function.
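To train one 4-class model, the three binary datasets first have to share one label space. A minimal sketch of that relabelling, assuming each dataset is a list of (image_id, is_diseased) pairs; the names `CLASSES`, `merge_datasets` and the toy image ids are hypothetical:

```python
CLASSES = ["normal", "blood cancer", "blood infection", "sickle cell disease"]

def merge_datasets(datasets):
    """datasets maps a disease name to a list of (image_id, is_diseased) pairs."""
    merged = []
    for disease, samples in datasets.items():
        for image_id, is_diseased in samples:
            # "normal" from every dataset collapses into class 0.
            label = CLASSES.index(disease) if is_diseased else 0
            merged.append((image_id, label))
    return merged

toy = {
    "blood cancer": [("a1", 0), ("a2", 1)],
    "blood infection": [("b1", 0), ("b2", 1)],
    "sickle cell disease": [("c1", 1)],
}
merged = merge_datasets(toy)
print(merged)
```

Note that the "normal" class ends up with images from all three sources, so it will usually be larger than each disease class.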
Related
After following some tutorials on LSTM networks, I've decided to put my knowledge in practice by training a LSTM model on my own dataset.
Here is a view of my data:
As you can observe, I have the same number of samples and labels.
Let's say that I have 10 samples and 10 labels for those samples, and I want to split those samples into sequences of 2 timesteps.
After splitting I would have 5 samples, each having 2 timesteps, but I would still have 10 labels.
Am I right?
How do you guys deal with this problem?
If I try to feed the data in this form, I get a "Data cardinality is ambiguous" exception.
In an LSTM, every input sequence has one label (in the case of simple classification, at least). So in your case each training example would be two samples of the position, with a single label.
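One way to get matching counts is to build the windows and their labels together. The sketch below is plain Python and makes an assumption that the answer does not state: each window keeps the label of its last timestep. The helper name `make_windows` and the toy data are hypothetical.

```python
def make_windows(samples, labels, timesteps):
    """Group consecutive samples into non-overlapping windows.

    Assumption: each window is labelled with the label of its last sample.
    """
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    X, y = [], []
    for start in range(0, len(samples) - timesteps + 1, timesteps):
        end = start + timesteps
        X.append(samples[start:end])
        y.append(labels[end - 1])
    return X, y

samples = [[float(i)] for i in range(10)]   # 10 samples, 1 feature each
labels = [0, 0, 1, 1, 0, 0, 1, 1, 0, 1]     # 10 labels
X, y = make_windows(samples, labels, timesteps=2)
print(len(X), len(X[0]), len(y))
```

The result has 5 windows of shape (2 timesteps, 1 feature) and 5 labels, so the number of inputs and targets agrees again.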
I am trying to construct an RNN to predict the probability of a player playing the match, along with the runs scored and wickets taken by the player. I would use an LSTM so that performance in the current match would influence the player's future selection.
Architecture summary:
Input features: Match details - Venue, teams involved, team batting first
Input samples: Player roster of both teams.
Output:
Discrete: Binary: Did the player play.
Discrete: Wickets taken.
Continuous: Runs scored.
Continuous: Balls bowled.
Question:
Most often an RNN uses "Softmax" or "MSE" in the final layer to process "a" from the LSTM, providing only a single variable "Y" as output. But here there are four dependent variables (2 discrete and 2 continuous). Is it possible to stitch together all four as output variables?
If yes, how do we handle the mix of continuous and discrete outputs in the loss function?
(Though the output "a" from the LSTM has multiple features and carries the information to the next time slot, we need multiple features at the output for training based on the ground truth.)
You just do it. Without more detail on the software (if any) in use, it is hard to give more detail.
The output of the LSTM unit at every time step is one of the hidden layers of your network.
You can then input it into 4 output layers.
1: sigmoid
2: I'd mess around with this a bit. Maybe 4x sigmoid (4 wickets to an innings, right?) or relu4
3, 4: linear (squaring it is also an option, or relu)
For training purposes your loss function is the sum of your 4 individual losses.
If they were all MSE, you could concatenate your 4 outputs before calculating the loss.
But since the first is cross-entropy (for a decision sigmoid), you'd calculate them separately and sum.
You can still concatenate them afterwards to have an output vector.
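The "sum of individual losses" can be sketched without any framework. This is an illustration under stated assumptions, not the answerer's code: wickets are treated as a plain regression target for simplicity, the per-head weights are a hypothetical addition, and all names and numbers are made up.

```python
import math

def binary_cross_entropy(p, target, eps=1e-7):
    # Clip to avoid log(0) when the sigmoid output saturates.
    p = min(max(p, eps), 1.0 - eps)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))

def squared_error(pred, target):
    return (pred - target) ** 2

def combined_loss(pred, target, weights=(1.0, 1.0, 1.0, 1.0)):
    """pred and target are dicts with keys: played, wickets, runs, balls."""
    losses = [
        binary_cross_entropy(pred["played"], target["played"]),  # sigmoid head
        squared_error(pred["wickets"], target["wickets"]),       # assumed regression head
        squared_error(pred["runs"], target["runs"]),              # linear head
        squared_error(pred["balls"], target["balls"]),            # linear head
    ]
    return sum(w * l for w, l in zip(weights, losses))

pred = {"played": 0.9, "wickets": 1.5, "runs": 30.0, "balls": 20.0}
target = {"played": 1, "wickets": 2, "runs": 34, "balls": 24}
loss = combined_loss(pred, target)
print(loss)
```

In this toy case the squared errors on runs and balls dominate the total, which is why per-head weights (or rescaled targets) are often worth considering when the outputs live on very different scales.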
I am currently looking into multi-labeling classification and I have some questions (and I couldn't find clear answers).
For the sake of clarity let's take an example : I want to classify images of vehicles (car, bus, truck, ...) and their make (Audi, Volkswagen, Ferrari, ...).
So I thought about training two independent CNNs (one for the "type" classification and one for the "make" classification), but I thought it might be possible to train only one CNN on all the classes.
I read that people tend to use a sigmoid function instead of softmax to do that. I understand that sigmoid outputs do not sum to 1 like softmax outputs do, but I don't understand how that enables multi-label classification.
My second question is: is it possible to take into account that some classes are completely independent?
Thirdly, in terms of performance (accuracy and time to classify a new image), isn't training two independent networks better?
Thank you to those who could give me some answers or some ideas :)
Softmax is a special output function; it forces the output vector to have a single large value. Now, training neural networks works by calculating an output vector, comparing that to a target vector, and back-propagating the error. There's no reason to restrict your target vector to a single large value, and for multi-labeling you'd use a 1.0 target for every label that applies. But in that case, using a softmax for the output layer will cause unintended differences between output and target, differences that are then back-propagated.
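A small numeric sketch of that mismatch, using made-up logits for four hypothetical labels (car, bus, Audi, Ferrari) where the target is "car" and "Audi" together:

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Label order: car, bus, Audi, Ferrari. Target vector: [1, 0, 1, 0].
logits = [4.0, -4.0, 4.0, -4.0]
soft = softmax(logits)
sig = [sigmoid(z) for z in logits]
print(soft)
print(sig)
```

With softmax the two correct labels have to share the probability mass, so neither can exceed 0.5 here and there is always an error against the 1.0 targets. With independent sigmoids both can approach 1.0 at the same time.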
For the second part: you define the target vectors; you can encode any sort of dependency you like there.
Finally, no - a combined network performs better than the two halves would do independently. You'd only run two networks in parallel when there's a difference in network layout, e.g. a regular NN and CNN in parallel might be viable.
I have data with an integer target class in the range 1-5, where one is the lowest and five the highest. In this case, should I consider it as a regression problem and have one node in the output layer?
My way of handling it is:
1- first I convert the labels to a binary class matrix
labels = to_categorical(np.asarray(labels))
2- in the output layer, I have five nodes
main_output = Dense(5, activation='sigmoid', name='main_output')(x)
3- I use categorical_crossentropy as the loss and mean_squared_error as a metric when compiling
model.compile(optimizer='rmsprop',loss='categorical_crossentropy',metrics=['mean_squared_error'],loss_weights=[0.2])
Also, can anyone tell me: what is the difference between using categorical_accuracy and mean_squared_error in this case?
Regression and classification are vastly different things. If you reimagine this as a regression task, then predicting 2 when the ground truth is 4 will be penalized more than predicting 3 instead of 4. If you have classes like car, animal, person, you do not care about a ranking between those classes. Predicting car is just as wrong as predicting animal if the image shows a person.
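The difference in how the two views score a wrong answer can be shown with two tiny loss functions. The function names are hypothetical and the zero-one loss stands in for "classification" only as an illustration:

```python
def squared_error(pred, truth):
    # Regression view: the size of the miss matters.
    return (pred - truth) ** 2

def zero_one_loss(pred, truth):
    # Classification view: any wrong class is equally wrong.
    return 0 if pred == truth else 1

truth = 4
regression_penalties = [squared_error(p, truth) for p in (2, 3)]
classification_penalties = [zero_one_loss(p, truth) for p in (2, 3)]
print(regression_penalties, classification_penalties)
```

For ratings from 1 to 5 the ordering is meaningful, so the regression-style penalty may actually be what you want; for unordered classes it is not.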
Metrics do not impact your learning at all. A metric is just something that is computed in addition to the loss to show the performance of the model. Here accuracy makes sense, because it is usually the metric that we care about. Mean squared error does not tell you directly how well your model performs. If you get something like 0.0015 mean squared error it sounds good, but it is hard to picture just how well this performs. In contrast, achieving 95% accuracy, for example, is meaningful.
One last thing: you should use softmax instead of sigmoid as your final activation to get a probability distribution in your final layer. Softmax will output probabilities for every class that sum to 1. Then cross-entropy measures the difference between the probability distribution of your network output and the ground truth.
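A framework-free sketch of softmax followed by categorical cross-entropy for the 1-5 labels. The helper names and logits are made up. One assumption worth checking in the original setup: Keras `to_categorical` counts classes from 0, so labels 1 to 5 would yield six columns unless they are shifted to 0-4 first; the `one_hot` helper below does that shift explicitly.

```python
import math

def one_hot(label, num_classes=5):
    """Labels are assumed to run from 1 to num_classes, so shift by one."""
    vec = [0.0] * num_classes
    vec[label - 1] = 1.0
    return vec

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(probs, target, eps=1e-7):
    # Only the entry of the true class contributes for a one-hot target.
    return -sum(t * math.log(max(p, eps)) for p, t in zip(probs, target))

probs = softmax([0.1, 0.2, 0.3, 2.0, 0.4])   # hypothetical network logits
loss_true_4 = categorical_cross_entropy(probs, one_hot(4))
loss_true_1 = categorical_cross_entropy(probs, one_hot(1))
print(probs, loss_true_4, loss_true_1)
```

The loss is small when the true class (here 4) receives most of the probability mass and larger when it does not.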
I need to develop a neural network and classify the inputs into 3 categories. One of the categories is "Don't Know".
Should I train the network using a single-output perceptron which categorizes the training examples as 1, 2, or 3? Or should I use a 2-output perceptron and use a binary scheme (01, 10, 00/11) to classify the inputs?
You should use 3 output neurons (one for each class). In the training phase, set the output of the neuron representing the correct class to 1 and all others to 0. A single output with 1, 2 and 3 is not optimal because it contains the implicit assumption that classes 2 and 3 are somehow "closer" to each other than 1 and 3. 2 outputs with binary coding is also not good, because in addition to solving the classification problem your NN will have to learn the binary encoding.
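The "closer" argument can be checked by measuring distances between target encodings. A minimal sketch with hypothetical names, comparing a single integer output against one-hot targets:

```python
import math

def distance(a, b):
    # Euclidean distance between two target vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

integer_codes = {1: [1.0], 2: [2.0], 3: [3.0]}
one_hot_codes = {1: [1.0, 0.0, 0.0], 2: [0.0, 1.0, 0.0], 3: [0.0, 0.0, 1.0]}

int_d13 = distance(integer_codes[1], integer_codes[3])
int_d23 = distance(integer_codes[2], integer_codes[3])
hot_d13 = distance(one_hot_codes[1], one_hot_codes[3])
hot_d23 = distance(one_hot_codes[2], one_hot_codes[3])
print(int_d13, int_d23, hot_d13, hot_d23)
```

With the integer coding, class 1 is twice as far from class 3 as class 2 is; with one-hot targets every pair of classes is equally far apart, so no ordering is implied.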
Also, it's probably best to use a softmax activation on the output layer with a cross-entropy error function. Softmax will normalize the output, so the values at each neuron can be interpreted as class probabilities.
Note that a "don't know" class is only useful if you have training examples labeled as "don't know". Otherwise, use two output neurons.