Multicollinearity test for binary logistic regression in R?

How can multicollinearity be tested for a binary logistic regression in R?
I tried correlation tables, but I don't know exactly what to do with them.

Related

Using OLS regression on binary outcome variable

I have previously been told that -- for reasons that make complete sense -- one shouldn't run OLS regressions when the outcome variable is binary (e.g. yes/no, true/false, win/loss, etc.). However, I often read papers in economics and other social sciences in which researchers run OLS regressions on binary variables and interpret the coefficients just as they would for a continuous outcome variable. A few questions about this:
Why do they not run a logistic regression? Is there any disadvantage or limitation to using logit models? In economics, for example, I very often see papers using OLS regression for a binary variable rather than logit. Can logit only be used in certain situations?
In general, when can one run an OLS regression on ordinal data? If I have a variable that captures "number of times in a week survey respondent does X", can I - in any circumstance - use it as a dependent variable in a linear regression? I often see this being done in literature as well, even though we're always told in introductory statistics/econometrics that outcome variables in an OLS regression should be continuous.
Applying OLS to a binary outcome is called the linear probability model (LPM). Compared to a logistic model, the LPM has advantages in implementation and interpretation that make it an appealing option for researchers conducting impact analysis. In the LPM, the parameters directly represent mean marginal effects, whereas in logistic regression the parameters represent log odds ratios. To calculate mean marginal effects from a logistic regression, we need to compute the derivative of the predicted probability with respect to the covariate at every data point and then average those derivatives. While logistic regression and the LPM usually yield roughly the same average impact estimate [1], researchers often prefer the LPM for estimating treatment impacts.
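To make the comparison concrete, here is a small simulated sketch (Python/statsmodels; the data and names are made up, not taken from [1]) in which the LPM coefficient and the logistic regression's average marginal effect land in roughly the same place:

    # Sketch only: compare the LPM coefficient with the average marginal
    # effect from a logistic regression, on simulated data.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 5000
    treatment = rng.integers(0, 2, n)           # binary treatment indicator
    latent = -0.5 + 1.0 * treatment + rng.logistic(size=n)
    y = (latent > 0).astype(int)                # binary outcome

    X = sm.add_constant(treatment.astype(float))

    # Linear probability model: the coefficient is directly the marginal effect.
    lpm = sm.OLS(y, X).fit()
    print("LPM effect of treatment:", lpm.params[1])

    # Logistic regression: the coefficient is a log odds ratio, so we average
    # the derivative of the predicted probability over all observations.
    logit = sm.Logit(y, X).fit(disp=0)
    print("Logit average marginal effect:", logit.get_margeff(at="overall").margeff[0])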
In general, yes, we can apply OLS to an ordinal outcome. As in the binary case, applying OLS to a binary or ordinal outcome violates some of the assumptions of OLS. However, many econometricians consider the practical effect of violating these assumptions to be minor, and believe the simplicity of interpreting OLS estimates outweighs the technical correctness of an ordered logit or probit model, especially when the ordinal outcome looks quasi-normal.
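For the ordinal case, a minimal sketch of what such an analysis looks like (simulated data and made-up variable names) might be:

    # Sketch only: treating an ordinal count ("times per week doing X", 0-7)
    # as a continuous outcome in OLS.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    n = 1000
    x = rng.normal(size=n)
    times_per_week = np.clip(np.round(3 + 1.2 * x + rng.normal(size=n)), 0, 7)

    ols = sm.OLS(times_per_week, sm.add_constant(x)).fit()
    print(ols.params)   # slope read as "extra occurrences per unit of x"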
Reference:
[1] Deke, J. (2014). Using the linear probability model to estimate impacts on binary outcomes in randomized controlled trials. Mathematica Policy Research.

Predicting rare events and their strength with LSTM autoencoder

I'm currently creating an LSTM to predict rare events. I've seen this paper, which suggests first using an autoencoder LSTM to extract features and then feeding the embeddings into a second LSTM that makes the actual prediction. According to the authors, the autoencoder extracts features (this is usually true) that are then useful for the prediction layers.
In my case, I need to predict whether there will be an extreme event (this is the most important thing) and then how strong it will be. Following their advice, I've created the model, but instead of adding one LSTM from the embeddings to the predictions, I add two: one for the binary prediction (it happens, or it doesn't), ending with a sigmoid layer, and a second one for predicting how strong it will be. So I have three losses: the reconstruction loss (MSE), the strength prediction loss (MSE), and the binary loss (binary cross-entropy).
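A rough Keras sketch of this setup (layer sizes and names are placeholders, not my exact model):

    # Rough sketch: LSTM autoencoder whose embeddings feed two extra heads
    # (binary event and event strength). Sizes and names are placeholders.
    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    timesteps, n_features, latent_dim = 30, 1, 16

    inputs = keras.Input(shape=(timesteps, n_features))
    embeddings = layers.LSTM(latent_dim, return_sequences=True,
                             name="encoder")(inputs)

    # Reconstruction branch (autoencoder part, MSE loss).
    recon = layers.LSTM(latent_dim, return_sequences=True)(embeddings)
    recon = layers.TimeDistributed(layers.Dense(n_features),
                                   name="reconstruction")(recon)

    # Binary head: will there be an extreme event? (binary cross-entropy)
    event = layers.LSTM(8)(embeddings)
    event = layers.Dense(1, activation="sigmoid", name="event")(event)

    # Strength head: how strong will it be? (MSE)
    strength = layers.LSTM(8)(embeddings)
    strength = layers.Dense(1, name="strength")(strength)

    model = keras.Model(inputs, [recon, event, strength])
    model.compile(optimizer="adam",
                  loss={"reconstruction": "mse",
                        "event": "binary_crossentropy",
                        "strength": "mse"})

    # Dummy data just to show the expected shapes.
    x = np.random.rand(8, timesteps, n_features).astype("float32")
    model.fit(x, {"reconstruction": x,
                  "event": np.random.randint(0, 2, (8, 1)).astype("float32"),
                  "strength": np.random.rand(8, 1).astype("float32")},
              epochs=1, verbose=0)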
The thing is that I'm not sure it is learning anything: the binary loss stays at 0.5, and even the reconstruction loss is not very good. And of course, the time series is mostly zeros with some values from 1 to 10, so MSE is probably not a good loss for it.
What do you think about this approach?
Is this a good architecture for predicting rare events? Which one would be better?
Should I add a CNN or fully connected layers between the embeddings and the prediction LSTMs, to extract 1D patterns from the embedding, or feed the embeddings directly into the prediction?
Should there be just a single prediction LSTM, using only an MSE loss?
Would it be a good idea to multiply the two predictions, to force the days predicted to have no event to coincide in both outputs?
Thanks,

Keras pass data through layers explicitly

I am trying to implement a pairwise learning-to-rank model with Keras, where the features are computed by a deep neural network.
While training the pairwise L2R model, I feed in the query, one positive result, and one negative result, and it is trained on a classification loss over the difference of the feature vectors.
I am able to compile and fit the model successfully, but the problem is actually using this model on test data.
In a pairwise L2R model, at test time I only have query-sample pairs (no separate negatives and positives), and I can use the value computed before the softmax to rank the samples.
Is there any way in Keras to pass data manually through particular trained layers at test time? (In short, I have 3 sets of inputs at training time and 2 at test time.)
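What I have in mind is something like the following sketch (made-up layer sizes; the shared "scorer" sub-model is what I would like to call on its own at test time):

    # Sketch only: share a scoring sub-model between a pairwise training
    # model and a standalone test-time scorer. Sizes/names are placeholders.
    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    feat_dim = 128

    # Shared scoring tower: maps a concatenated (query, sample) pair to a score.
    pair_in = keras.Input(shape=(feat_dim * 2,))
    h = layers.Dense(64, activation="relu")(pair_in)
    score = layers.Dense(1)(h)
    scorer = keras.Model(pair_in, score, name="scorer")

    # Pairwise training model: query + positive + negative.
    query = keras.Input(shape=(feat_dim,))
    pos = keras.Input(shape=(feat_dim,))
    neg = keras.Input(shape=(feat_dim,))

    pos_score = scorer(layers.Concatenate()([query, pos]))
    neg_score = scorer(layers.Concatenate()([query, neg]))

    # Sigmoid of the score difference, trained with binary cross-entropy
    # against an all-ones target (positive should outrank negative).
    diff = layers.Subtract()([pos_score, neg_score])
    prob = layers.Activation("sigmoid")(diff)

    train_model = keras.Model([query, pos, neg], prob)
    train_model.compile(optimizer="adam", loss="binary_crossentropy")

    # At test time, call the shared tower directly on (query, sample) pairs.
    q = np.random.rand(5, feat_dim).astype("float32")
    docs = np.random.rand(5, feat_dim).astype("float32")
    scores = scorer.predict(np.concatenate([q, docs], axis=1), verbose=0)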

Naive Bayes For Regression

I was wondering if I can apply naive Bayes to a regression problem, and how it would be done. I have 4096 image features and 384 text features, and it wouldn't be too bad an assumption to treat them as independent. Can anyone tell me how to proceed?
Naive Bayes is used with strings and numbers treated categorically.
It is a classification method, so its output is a class such as 1 or 0, with nothing in between like 0.5 (which is what regression would give you).
Even if we force naive Bayes into regression by tweaking it a little, the results are disappointing; a team experimented with this and achieved rather poor results (see the reference below).
Also, Wikipedia notes that naive Bayes is closely related to logistic regression:
Relation to logistic regression:
the naive Bayes classifier can be considered a way of fitting a probability model that optimizes the joint likelihood p(C, x), while logistic regression fits the same probability model to optimize the conditional p(C | x).
So now you have two choices: tweak the naive Bayes formula or use logistic regression.
I say let's use logistic regression instead of reinventing the wheel.
References:
Wikipedia:
https://en.wikipedia.org/wiki/Naive_Bayes_classifier#Relation_to_logistic_regression
Naive Bayes Regression Experiment: https://link.springer.com/content/pdf/10.1023%2FA%3A1007670802811.pdf
Naive Bayes doesn't make sense to me as a regression algorithm. Random forest regression might be a better fit for your problem; it should be able to handle mixed text and image features.
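A minimal sketch of that idea (random arrays standing in for your real 4096 image and 384 text features, using scikit-learn's RandomForestRegressor):

    # Sketch only: random forest regression on concatenated image + text
    # features; the arrays below are random stand-ins for the real features.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    n_samples = 200
    image_feats = np.random.rand(n_samples, 4096)
    text_feats = np.random.rand(n_samples, 384)
    y = np.random.rand(n_samples)               # continuous target

    X = np.hstack([image_feats, text_feats])    # simple feature concatenation

    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(X, y)
    print(model.predict(X[:5]))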

Adjusting Binary Logistic Formula in SPSS

I am running a binary logistic regression in SPSS to test the effect of, for example, TV advertisements on the probability that a consumer buys a product. My problem is that with the formula of binary logistic regression
P = 1 / (1 + e^(-(a + b*Adv)))
the maximum probability is 100%. However, even if I increase the number of advertisements by 1000, it is not sensible to assume that the probability of purchase will reach 100%. So if I draw the curve of the logistic regression using the estimated coefficients, at some point the probability approaches 100%, which is never the case in a real-life setting. How can I control for that?
Is there a way to change the SPSS binary logistic regression to have a maximum probability of, e.g., 20%?
Thank you!
The maximum hypothetical probability is 100%, but if you use real-world data, your model will fit the data such that the predicted probability for any observed value of x stays close to the real-world rate (give or take the model's error). I wouldn't worry too much about the hypothetical maximum probability as long as the model fits the data reasonably well. One of the key reasons for using logistic regression instead of OLS linear regression is precisely to avoid impossible predicted values.
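As a rough illustration of that point (simulated purchase data in Python/statsmodels rather than SPSS; the numbers are made up), the fitted probabilities stay near the rates actually observed in the data, well below 100%:

    # Sketch only: with realistic data the in-sample fitted probabilities of a
    # logistic regression never get anywhere near 1.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    adv = rng.integers(0, 20, 500)              # number of ads each consumer saw
    p_true = 0.02 + 0.15 * adv / 20             # true purchase prob tops out ~17%
    buy = rng.binomial(1, p_true)

    X = sm.add_constant(adv.astype(float))
    fit = sm.Logit(buy, X).fit(disp=0)

    print("max fitted probability in sample:", fit.predict(X).max())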