Coherence Value of 0.3 in LDA Model

I am having some confusion over the use of the coherence score when evaluating an LDA model.
I ran an LDA model on a dataset and obtained coherence scores ranging from 0.32 to 0.37 and perplexity scores ranging from -6.75 to -6.77 for various numbers of topics.
I am using the LDA model in the gensim package, and this is the code I use to calculate the coherence score:
from gensim.models import CoherenceModel

coherencemodel = CoherenceModel(model=lda_model, texts=texts, dictionary=id2word,
                                coherence='c_v')
coherenceScore = coherencemodel.get_coherence()
I had always understood that the coherence score is used to find the optimal number of topics for the LDA model, but I was also told that a coherence score of 0.3 is bad.
Can someone kindly explain what the coherence score is used for, and whether a score of 0.3 signifies a bad model?
And when comparing different LDA models, which is the better evaluation method: perplexity or the coherence score?
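For comparing candidate models on both measures, a rough sketch along the following lines can be used. It assumes the same corpus, id2word and texts objects as above, and the topic counts are purely illustrative; note that gensim's log_perplexity returns the per-word log-likelihood bound, which is why the values are negative.

from gensim.models import LdaModel, CoherenceModel

# Sketch only: train one model per candidate topic count and report both measures.
# `corpus`, `id2word` and `texts` are assumed to be the objects used in the question.
for k in (10, 20):
    lda = LdaModel(corpus=corpus, id2word=id2word, num_topics=k, random_state=0)
    bound = lda.log_perplexity(corpus)  # per-word log-likelihood bound (negative)
    cv = CoherenceModel(model=lda, texts=texts, dictionary=id2word,
                        coherence='c_v').get_coherence()
    print(f"num_topics={k}: log_perplexity bound={bound:.2f}, c_v coherence={cv:.3f}")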

Related

How to speed up YoloV5 TFLite inference time in phone app?

I am using the YoloV5 model for custom object recognition, and when I export it to a tflite model for inclusion in the mobile app, the resulting inference time is 5201.2 ms. How can I reduce the inference time for faster recognition? The dataset I use to train contains 2200 images, and I train with the yolov5x model. Thanks for helping me!
You have several options:
1. Train a smaller Yolo model (m instead of x, for example).
2. Resize the images (from 640x640 to, for example, 320x320; note that the dimensions need to be a multiple of the maximum stride, which is 32).
3. Quantize the model to FP16 or INT8 (see the sketch after this list).
4. Use the NNAPI delegate (this only provides a speedup if the device contains a HW accelerator: GPU, DSP, NN engine).
None of these options excludes the others; all can be used at the same time for maximum inference speed. Options 1, 2 and 3 will sacrifice model performance for inference speed.
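As an illustration of option 3, post-training float16 quantization can be done with the standard TFLite converter. The SavedModel path below is hypothetical, and the YOLOv5 repo's own export script offers similar quantization options.

import tensorflow as tf

# Hypothetical path to a YOLOv5 SavedModel export; adjust to your own export location.
converter = tf.lite.TFLiteConverter.from_saved_model("yolov5_saved_model")

# Post-training float16 quantization: roughly halves model size and usually
# speeds up inference on mobile GPUs, at a small cost in accuracy.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]

tflite_model = converter.convert()
with open("yolov5-fp16.tflite", "wb") as f:
    f.write(tflite_model)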

Using AUC score as a monitor for saving checkpoints of the model during train/val period

Suppose we train a model in deep learning. After each epoch we can calculate validation metrics, based on which we choose when to save the model. I read about AUC (area under the curve) for binary classification. What I read is this: "ROC is a probability curve and AUC represents the degree or measure of separability. It tells how much the model is capable of distinguishing between classes. Higher the AUC, the better the model is at predicting 0 classes as 0 and 1 classes as 1."
What I personally see is that when we have a larger AUC value, the model is better. So, why not use it as the monitored metric during the train/val period to decide when to save weights? Is there any reason why we shouldn't use it?
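One common way to do exactly this in Keras is to track AUC as a metric and point the checkpoint callback at it. A minimal sketch, assuming a purely illustrative binary classifier (the model architecture and file name are not from the question):

import tensorflow as tf

# Purely illustrative binary classifier; the relevant parts are the metric and the callback.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=[tf.keras.metrics.AUC(name="auc")],  # exposes "val_auc" on the validation set
)

# Save weights only when validation AUC improves; mode="max" because higher AUC is better.
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    "best_by_auc.h5",
    monitor="val_auc",
    mode="max",
    save_best_only=True,
)

# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=20, callbacks=[checkpoint])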

Hyperparameter search for lunarlander continuous of openAI gym

I'm trying to solve the LunarLander continuous environment from OpenAI Gym (solving LunarLanderContinuous-v2 means getting an average reward of 200 over 100 consecutive trials) with the best average reward possible over 100 consecutive episodes.
The difficulty is that I work with a Lunar Lander subject to uncertainty (explanation: observations in the real physical world are sometimes noisy). Specifically, I add zero-mean Gaussian noise with std = 0.05 to the PositionX and PositionY observations of the lander's location.
I also discretise the LunarLander actions to a finite number of actions instead of the continuous range the environment allows.
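For concreteness, a setup along these lines can be expressed with standard Gym wrappers; the particular 3x3 action grid below is only an illustration, not necessarily the discretisation actually used.

import numpy as np
import gym

class NoisyPosition(gym.ObservationWrapper):
    """Add zero-mean Gaussian noise (std = 0.05) to the x/y position observations."""
    def observation(self, obs):
        obs = np.array(obs, dtype=np.float32)
        obs[0] += np.random.normal(0.0, 0.05)  # PositionX
        obs[1] += np.random.normal(0.0, 0.05)  # PositionY
        return obs

class DiscretizedActions(gym.ActionWrapper):
    """Expose a finite set of continuous actions as a Discrete action space."""
    def __init__(self, env, actions):
        super().__init__(env)
        self._actions = np.array(actions, dtype=np.float32)
        self.action_space = gym.spaces.Discrete(len(actions))

    def action(self, act):
        return self._actions[act]

# Illustrative 3x3 grid over (main engine, side engine) values in [-1, 1].
grid = [(m, s) for m in (-1.0, 0.0, 1.0) for s in (-1.0, 0.0, 1.0)]
env = DiscretizedActions(NoisyPosition(gym.make("LunarLanderContinuous-v2")), grid)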
So far I'm using DQN, double-DQN and Duelling DDQN.
My hyperparameters are:
gamma
epsilon start
epsilon end
epsilon decay
learning rate
number of actions (discretisation)
target update
batch size
optimizer
number of episodes
network architecture
I'm having difficulty reaching good or even mediocre results.
Does someone have an advice about the hyperparameters changes I should make to improve my results?
Thanks!

how can i get the topic coherence score of two models and then use it for comparison?

I want to get the topic coherence for an LDA model. Let's say I have two LDA models, one built on a bag of words and the second on a bag of phrases. How can I get the coherence for these two models and then compare them on that basis?
For two separate models you can just check the coherence separately. You should post some code, but this is how to check coherence:
from gensim.models import CoherenceModel

# Compute Coherence Score
coherence_model_ldamallet = CoherenceModel(model=ldamallet, texts=processed_docs, dictionary=dictionary, coherence='c_v')
coherence_ldamallet = coherence_model_ldamallet.get_coherence()
print('\nCoherence Score: ', coherence_ldamallet)
If you want a comparison, check out the elbow method for optimizing coherence (sketched below). I hope this helps.
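A rough sketch of the elbow method, assuming the same corpus, dictionary and processed_docs objects as in the snippet above: train one model per topic count, record the c_v coherence, and look for the point where the curve levels off.

from gensim.models import LdaModel, CoherenceModel
import matplotlib.pyplot as plt

# Sketch only: sweep the number of topics and record c_v coherence for each model.
topic_range = list(range(2, 21, 2))
coherence_values = []
for num_topics in topic_range:
    lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=num_topics, random_state=42)
    cm = CoherenceModel(model=lda, texts=processed_docs, dictionary=dictionary, coherence='c_v')
    coherence_values.append(cm.get_coherence())

# Plot coherence against the number of topics; the "elbow" suggests a reasonable topic count.
plt.plot(topic_range, coherence_values, marker='o')
plt.xlabel('Number of topics')
plt.ylabel('c_v coherence')
plt.show()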

Loss function for ordinal target on SoftMax over Logistic Regression

I am using Pylearn2 or Caffe to build a deep network. My target is ordinal (ordered categories). I am trying to find a proper loss function but cannot find any in Pylearn2 or Caffe.
I read the paper "Loss Functions for Preference Levels: Regression with Discrete Ordered Labels". I get the general idea, but I am not sure I understand what the thresholds will be if my final layer is a SoftMax over Logistic Regression (outputting probabilities).
Can someone help me by pointing to an implementation of such a loss function?
Thanks
Regards
For both Pylearn2 and Caffe, your labels will need to be 0-4 instead of 1-5; it's just the way they work. The output layer will be 5 units, each essentially a logistic unit, and the softmax can be thought of as an adaptor that normalizes the final outputs. But "softmax" is commonly used as an output type. When training, the value of any individual unit is rarely ever exactly 0.0 or 1.0; it's always a distribution across your units, on which log-loss can be calculated. This loss is compared against the "perfect" case and the error is back-propagated to update your network weights. Note that a raw output from Pylearn2 or Caffe is not a specific digit 0, 1, 2, 3, or 4; it's 5 numbers, each associated with the likelihood of one of the 5 classes. When classifying, one just takes the class with the highest value as the 'winner'.
I'll try to give an example...
Say I have a 3-class problem and I train a network with a 3-unit softmax.
The first unit represents the first class, the second unit the second class, and the third the third.
Say I feed a test case through and get
0.25, 0.5, 0.25. Since 0.5 is the highest, a classifier would say "2". This is the softmax output; it makes sure the sum of the output units is one.
You should have a look at ordinal (logistic) regression. This is the formal solution to the problem setup you describe (do not use plain regression, as the distance measures of the errors would be wrong).
https://stats.stackexchange.com/questions/140061/how-to-set-up-neural-network-to-output-ordinal-data
In particular, I recommend looking at the CORAL ordinal regression implementation at
https://github.com/ck37/coral-ordinal/issues.
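One common way to realise ordinal regression in a neural network is to replace the K-way softmax with K-1 cumulative binary outputs, asking "is the label greater than level k?" for each threshold (a simplified variant of the extended binary classification idea that CORAL builds on, without CORAL's rank-consistency constraint). A minimal, purely illustrative sketch in modern tf.keras rather than Pylearn2/Caffe, with made-up data and 5 ordered levels:

import numpy as np
import tensorflow as tf

NUM_CLASSES = 5  # assumption: 5 ordered levels, labelled 0..4

def ordinal_encode(y, num_classes=NUM_CLASSES):
    # Encode an ordinal label y in {0..K-1} as K-1 cumulative binary targets.
    return np.array([[1.0 if label > k else 0.0 for k in range(num_classes - 1)]
                     for label in y], dtype="float32")

# Toy data, purely illustrative: 100 samples with 10 features.
x = np.random.rand(100, 10).astype("float32")
y = np.random.randint(0, NUM_CLASSES, size=100)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(10,)),
    # K-1 sigmoid outputs, one per "is the label greater than level k?" question.
    tf.keras.layers.Dense(NUM_CLASSES - 1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(x, ordinal_encode(y), epochs=5, verbose=0)

# Predicted level = number of thresholds the model believes the sample exceeds.
pred = (model.predict(x) > 0.5).sum(axis=1)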