How to generate scatter plots between the outcome variable and one of the independent variables - regression

plot(x, y) can generate scatter plots for simple regression results y = a + bx; how can I then generate a scatter plot between the outcome variable and one of the independent variables for the regression Y = A + bX1 + cX2 + dX3? Can we run y = a1 + cX2 + dX3 to get residuals, and then plot residuals ~ X1?

Yes, and the syntax doesn't change. Just remember that if you are working in R, the syntax should be
plot(model$residuals, variable)
where "model" is your fitted linear model.
By the way, if you want to store the residuals in your dataset, you should type:
db$errors <- model$residuals
where "db" is your dataset.

Related

PyTorch - Neural Network - Output single scalar value

Let's say we have the following neural network in PyTorch
seq_model = nn.Sequential(
    nn.Linear(1, 13),
    nn.Tanh(),
    nn.Linear(13, 1))
With the following input tensor
input = torch.tensor([1.0, 1.0, 5.0], dtype=torch.float32).unsqueeze(1)
I can run a forward pass through the net and get
seq_model(input)
tensor([[-0.0165],
        [-0.0165],
        [-0.2289]], grad_fn=<TanhBackward0>)
Probably I also can get a single scalar value as an output, but I'm not sure how.
Thank you. I'm trying to use such a network for reinforcement learning, as a value function approximator for game board state evaluation.
The first dimension of the input represents the number of observations in your minibatch (3); the second dimension represents the number of features (1).
If you want to forward a single 3-dimensional input, the network must be modified (nn.Linear(1, 13) becomes nn.Linear(3, 13)), and you must remove unsqueeze(1) from the input. Otherwise, you can merge the three outputs by using a loss (or some other reduction) to compute a single scalar from them.
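
If the goal is one scalar value per board state, here is a minimal sketch of both options described above (the mean in the second option is just one illustrative choice of reduction):

import torch
import torch.nn as nn

# Option 1: treat the three values as one observation with three features.
# The first Linear layer must then accept 3 input features.
seq_model = nn.Sequential(
    nn.Linear(3, 13),
    nn.Tanh(),
    nn.Linear(13, 1))

state = torch.tensor([1.0, 1.0, 5.0], dtype=torch.float32)  # shape (3,) -- no unsqueeze
value = seq_model(state)    # shape (1,)
scalar = value.item()       # plain Python float, e.g. for a value function

# Option 2: keep the original (1 -> 13 -> 1) network and reduce the three
# per-observation outputs to a single scalar, e.g. with a mean.
orig_model = nn.Sequential(
    nn.Linear(1, 13),
    nn.Tanh(),
    nn.Linear(13, 1))
batch = torch.tensor([1.0, 1.0, 5.0], dtype=torch.float32).unsqueeze(1)  # shape (3, 1)
scalar2 = orig_model(batch).mean()  # single scalar tensor, still differentiable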

What exactly is a Softmax output layer?

I'm trying to make a simple conv net in C#, and I want to make a Softmax output layer, but I don't really know what it is. Is it a fully connected layer with Softmax activation, or just a layer which outputs the softmax of the data?
Softmax is just a function that takes a vector and outputs a vector of the same size with values in the range [0,1]. The values inside the output vector also follow the fundamental rule of probability, i.e., they sum to 1.
softmax(x)_i = exp(x_i) / ( sum_{j=1}^{K} exp(x_j) )    for each i = 1, ..., K
But sometimes people use "Softmax classifier", which refers to an MLP with an input layer and a single output layer (making it a linear classifier, like a linear SVM), where the softmax function is applied to the outputs of the output layer. This setup gives the probability of the input belonging to each of the output classes.
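
As a quick numerical illustration (written in Python rather than C#, just to make the arithmetic concrete):

import numpy as np

def softmax(x):
    # Subtract the max for numerical stability; this does not change the result.
    z = np.exp(x - np.max(x))
    return z / z.sum()

logits = np.array([2.0, 1.0, 0.1])   # raw outputs of the last fully connected layer
probs = softmax(logits)

print(probs)        # roughly [0.659, 0.242, 0.099] -- each value in [0, 1]
print(probs.sum())  # 1.0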

Predicting continuous valued output

I am working on predicting Semantic Textual Similarity (SemEval 2017 Task-1) between a pair of texts. The similarity score (the output) is a continuous value in [0, 5]. The neural network model (link below) therefore has 6 units in the final layer, for predicting values in [0, 5]. The objective function used is the Pearson correlation coefficient, and softmax activation is used. Now, in order to train the model, how can I give the target output values to the model? Since there are 6 output classes, I should probably send one-hot-encoded vectors of the output. In that case, how can we convert the output (which might be a float value such as 2.33) to a one-hot vector of length 6? Or is there any other way of specifying the target output and training the model?
Paper: http://nlp.arizona.edu/SemEval-2017/pdf/SemEval016.pdf
If the value you're trying to predict is continuously-defined, you might be better off configuring this as a regression architecture. This will be simpler to train and interpret and will give you non-integer predictions (which you can then bucket or threshold however you please).
In order to do this, replace your softmax layer with a layer containing a single neuron with a linear activation function. Then you can simply train this network using your real-valued similarity scores at the output. For the loss function, you can use MSE / L2 unless you have a reason to do otherwise.
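
A minimal Keras sketch of that change (the feature extractor here is just a placeholder dense stack standing in for the model in the paper; only the single-neuron linear output and the MSE loss are the point):

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Placeholder feature extractor: in practice this would be whatever encodes the
# sentence pair; only the regression head matters for this question.
inputs = keras.Input(shape=(300,))
h = layers.Dense(128, activation="relu")(inputs)
# Regression head: one neuron with linear activation, instead of a 6-way softmax.
score = layers.Dense(1, activation="linear")(h)

model = keras.Model(inputs, score)
model.compile(optimizer="adam", loss="mse")

# Targets are the raw real-valued similarity scores in [0, 5] -- no one-hot encoding.
X = np.random.rand(8, 300).astype("float32")   # dummy features
y = np.array([2.33, 4.0, 0.5, 1.2, 3.7, 5.0, 0.0, 2.9], dtype="float32")
model.fit(X, y, epochs=1, verbose=0)

pred = model.predict(X)  # real-valued scores, which can be clipped to [0, 5] if needed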

How to get residuals as variable in multiple regression in R?

I am trying to extract the regression residuals as a variable in my data so I can use them for further analysis in my dataset.
I use the following code:
fresid <- lm(var1 ~ var2 + var3 + var4 + var5, data = female)
This gives me a model object in the global environment, but I am unable to get the fitted residuals for my data.

Tune input features using backprop in keras

I am trying to implement discriminant condition codes in Keras as proposed in
Xue, Shaofei, et al., "Fast adaptation of deep neural network based
on discriminant codes for speech recognition."
The main idea is that you encode each condition as an input parameter and let the network learn the dependency between the condition and the feature-label mapping. On a new dataset, instead of adapting the entire network, you just tune these condition codes using backprop. For example, say my network looks like this:
X ----> |-----|
        | DNN | ----> Y
Z ----> |-----|
X: features, Y: labels, Z: condition codes
Now, given a pretrained DNN and X', Y' on a new dataset, I am trying to estimate, using backprop, the Z' that will minimize the prediction error on Y'. The math seems straightforward, except I am not sure how to implement this in Keras without having access to the backprop itself.
For instance, can I add an Input() layer with trainable=True, with all other layers set to trainable=False? Can backprop in Keras update more than just layer weights? Or is there a way to hack Keras layers to do this?
Any suggestions are welcome. Thanks.
I figured out how to do this (exactly) in Keras by looking at fchollet's post here.
Using the Keras backend, I was able to compute the gradient of my loss w.r.t. Z directly and use it to drive the update.
Code below:
import keras.backend as K
import numpy as np

model.summary()  # pretrained model; X, Y, Z are symbolic nodes in its graph

# Symbolic cross-entropy loss between the labels and the model's prediction
loss = K.categorical_crossentropy(Y, Y_out)

# Gradient of the loss w.r.t. the condition codes Z, normalized for stable steps
grads = K.gradients(loss, Z)[0]
grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-5)

# Function mapping (features, codes) to (loss, gradient w.r.t. codes)
iterate = K.function([X, Z], [loss, grads])

step = 0.1
Z_adapt = Z_in.copy()
for i in range(100):
    loss_val, grads_val = iterate([X_in, Z_adapt])
    Z_adapt -= grads_val * step   # gradient-descent update of the codes
    print("iter:", i, np.mean(loss_val))

print("Before:")
print(model.evaluate([X_in, Z_in], Y_out))
print("After:")
print(model.evaluate([X_in, Z_adapt], Y_out))
X, Y, and Z are nodes in the model graph. Z_in is an initial value for Z'; I set it to an average value from the training set. Z_adapt is the value after 100 iterations of gradient descent and should give you a better result.
Assume that the size of Z is m x n. Then you can first define an input layer of size m*n x 1; the input will be an m*n x 1 vector of ones. You can define a dense layer containing m*n neurons and set trainable = True for it. The response of this layer will give you a flattened version of Z. Reshape it appropriately and feed it into the rest of the network, which can be appended after this layer.
Keep in mind that if the size of Z is too large, the network may not be able to learn a dense layer with that many neurons. In that case, you may need to put additional constraints on it or look into convolutional layers. However, convolutional layers will put some constraints on Z.
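
A rough Keras sketch of that idea (the shapes, layer names, and the frozen placeholder body are illustrative assumptions): a constant input of ones drives a trainable dense layer whose weights effectively become the learned Z, while everything else is frozen.

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

m, n = 4, 8            # assumed size of Z (m x n)
num_features = 20      # assumed feature dimension of X
num_classes = 10       # assumed number of label classes

# "Code generator": a constant input of ones feeds a dense layer whose weights
# play the role of the flattened Z. Only this layer is trainable.
ones_in = keras.Input(shape=(1,), name="ones")
z_flat = layers.Dense(m * n, use_bias=False, name="code_layer")(ones_in)
# Reshape z_flat to (m, n) here if the pretrained body expects a 2-D code.

# Frozen pretrained body (a placeholder stack standing in for the real DNN).
x_in = keras.Input(shape=(num_features,), name="features")
h = layers.Concatenate()([x_in, z_flat])
h = layers.Dense(64, activation="relu", trainable=False)(h)
y_out = layers.Dense(num_classes, activation="softmax", trainable=False)(h)

model = keras.Model([x_in, ones_in], y_out)
model.compile(optimizer="adam", loss="categorical_crossentropy")

# Adaptation data: X', Y' from the new condition, plus the constant ones input.
X_new = np.random.rand(32, num_features).astype("float32")
Y_new = keras.utils.to_categorical(np.random.randint(num_classes, size=32), num_classes)
ones = np.ones((32, 1), dtype="float32")
model.fit([X_new, ones], Y_new, epochs=5, verbose=0)  # updates only "code_layer"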