Assistance needed to interpret regression model output - regression

Please find my output below:
Call:
lm(formula = Saves ~ `Perceived material` * (Hue + Saturation +
Value), data = ALL_DATA)
Residuals:
Min 1Q Median 3Q Max
-189.64 -84.01 -58.50 20.86 916.28
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 68.57024 49.77586 1.378 0.1686
`Perceived material`Metal 64.78830 93.42042 0.694 0.4881
`Perceived material`Other -28.85379 345.03697 -0.084 0.9334
`Perceived material`Paper-based -1.60833 59.18130 -0.027 0.9783
`Perceived material`Plastic -9.84939 63.24575 -0.156 0.8763
Hue -0.35445 0.12766 -2.776 0.0056 **
Saturation -0.07202 0.55582 -0.130 0.8969
Value 1.23305 0.55682 2.214 0.0270 *
`Perceived material`Metal:Hue 0.34229 0.19750 1.733 0.0834 .
`Perceived material`Other:Hue 0.30847 0.82227 0.375 0.7076
`Perceived material`Paper-based:Hue 0.37635 0.15085 2.495 0.0128 *
`Perceived material`Plastic:Hue 0.36114 0.15827 2.282 0.0227 *
`Perceived material`Metal:Saturation 0.26360 0.77023 0.342 0.7322
`Perceived material`Other:Saturation 0.14805 3.05534 0.048 0.9614
`Perceived material`Paper-based:Saturation -0.06672 0.63166 -0.106 0.9159
`Perceived material`Plastic:Saturation 0.26501 0.65924 0.402 0.6878
`Perceived material`Metal:Value -1.75033 1.04335 -1.678 0.0937 .
`Perceived material`Other:Value -1.47734 3.77538 -0.391 0.6957
`Perceived material`Paper-based:Value -0.93587 0.66170 -1.414 0.1576
`Perceived material`Plastic:Value -0.55704 0.70333 -0.792 0.4285
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 152.9 on 1034 degrees of freedom
(29 observations deleted due to missingness)
Multiple R-squared: 0.02869, Adjusted R-squared: 0.01084
F-statistic: 1.607 on 19 and 1034 DF, p-value: 0.04757
I would like some help with interpreting this table, specifically the interaction terms starting from `Perceived material`Metal:Hue.
Hue (an independent variable) on its own is negative and significant (β = -0.35445), meaning that a lower Hue is associated with more Saves (the dependent variable).
Why, then, is `Perceived material`Paper-based:Hue, for example, a positive number (β = 0.37635)? Am I right in saying that the effect of a lower Hue value 'evens out' when the material is paper-based?
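As a worked check of that reading (plain arithmetic on the estimates above, assuming R's default treatment coding so that each interaction is added to the Hue main effect), the fitted Hue slope within each material level would be:

$$
\begin{aligned}
\text{reference material:}\quad & -0.354\\
\text{Metal:}\quad & -0.354 + 0.342 \approx -0.012\\
\text{Other:}\quad & -0.354 + 0.308 \approx -0.046\\
\text{Paper-based:}\quad & -0.354 + 0.376 \approx +0.022\\
\text{Plastic:}\quad & -0.354 + 0.361 \approx +0.007
\end{aligned}
$$

So the negative Hue slope estimated for the reference material would be roughly flattened to zero for metal, paper-based and plastic items.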
Thanks for your help in advance.

Related

glmer - odds ratios & interpreting a binary outcome

Dear Members of the community,
It is my understanding that summary() of an lmer model reports estimates on the scale of the outcome itself (e.g. reaction time in milliseconds for a given trial).
But what about glmer models where the outcome is strictly binary (e.g. a 0/1 score for a given trial)?
I have read that in such cases odds ratios can be used to quantify results estimated on the logit scale, but I am unsure how to compute them for the current example:
> summary(recsem.full)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [
glmerMod]
Family: binomial ( logit )
Formula: score ~ 1 + modality + session + (1 + modality | PID) + (1 +
modality | stim)
Data: recsem
Control: glmerControl(optimizer = "nmkbw")
AIC BIC logLik deviance df.resid
1634.7 1685.5 -808.4 1616.7 2063
Scaled residuals:
Min 1Q Median 3Q Max
-7.3513 0.1192 0.2441 0.4284 2.1665
Random effects:
Groups Name Variance Std.Dev. Corr
PID (Intercept) 1.8173 1.3481
modality 1.1752 1.0841 -0.44
stim (Intercept) 0.5507 0.7421
modality 0.2508 0.5008 0.66
Number of obs: 2072, groups: PID, 37; stim, 28
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.1191 0.2972 7.129 1.01e-12 ***
modality 0.6690 0.2815 2.376 0.01749 *
session -0.4239 0.1302 -3.255 0.00113 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) modlty
modality -0.269
session -0.251 -0.008
Looking at the fixed effects, my interpretation is that both predictors are significant, but how do I interpret or transform the estimates into something more tangible?
Thank you very much in advance for any insights you might be able to share.
Have a nice day!
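As a hedged worked note (plain arithmetic on the fixed-effect estimates above): with a binomial (logit) family the estimates are on the log-odds scale, so exponentiating them gives odds ratios, e.g.

$$
\begin{aligned}
e^{2.1191} &\approx 8.32 \quad \text{(baseline odds, i.e. } p \approx 8.32/9.32 \approx 0.89\text{)}\\
e^{0.6690} &\approx 1.95 \quad \text{(odds ratio for modality)}\\
e^{-0.4239} &\approx 0.65 \quad \text{(odds ratio per unit increase in session)}
\end{aligned}
$$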

Variance of the prediction of a two-point change after linear regression

This may be an absolutely basic question, so forgive me if it is.
Let's say I have the following regression output
Call:
lm(formula = y ~ x1 + x2, data = fake)
Residuals:
Min 1Q Median 3Q Max
-2.9434 -0.6851 0.0231 0.6744 3.6313
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.130431 0.056489 -2.309 0.0211 *
x1 0.014597 0.003454 4.226 2.59e-05 ***
x2 -0.025518 0.062429 -0.409 0.6828
---
Now, I want to predict y from new data. I'm specifically interested in the error around my prediction.
Let's say that one row in the new observed dataset has
x1 = 2
x2 = 0
The expected value for y_pred is
y_pred = -0.1304 + 2 * 0.01460
...but what is the standard deviation of that prediction? Can I figure out 95% CIs on that prediction?
And specifically, would I figure those out by applying the std. error of beta_1 twice, once for each unit increase, or do I apply it only once?
EDIT to add: I don't have the original data, just the coefficient estimate and SE...so calculating using matrix algebra won't be possible.
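As a hedged worked note (standard OLS variance algebra, not from the original post): for a linear combination $x_0^\top\hat\beta$ of the coefficient estimates,

$$\operatorname{Var}\!\left(x_0^\top \hat\beta\right) = x_0^\top \operatorname{Cov}(\hat\beta)\, x_0 ,$$

and for the change in the prediction due to a two-unit increase in $x_1$ alone the standard error scales with the multiplier rather than being "applied twice":

$$\operatorname{SE}\!\left(2\hat\beta_1\right) = 2\,\operatorname{SE}\!\left(\hat\beta_1\right) = 2 \times 0.003454 \approx 0.0069 .$$

The standard error of the full prediction at $x_1 = 2,\ x_2 = 0$ additionally involves $\operatorname{Var}(\hat\beta_0)$ and $\operatorname{Cov}(\hat\beta_0, \hat\beta_1)$, and a prediction interval for a new observation also needs the residual variance, neither of which is recoverable from the coefficient table alone.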

Using linear approximation to perform addition and subtraction | error barrier

I'm attempting my first solo project after taking an introductory course in machine learning: using linear approximation to predict the outcome of adding or subtracting two numbers.
I have 3 features: first number, subtraction/addition (0 or 1), and second number.
So my input looks something like this:
3 0 1
4 1 2
3 0 3
With corresponding output like this:
2
6
0
I have (I think) successfully implemented the linear regression algorithm, as the squared error does gradually decrease, but with 100 training values ranging from 0 to 50, the squared error flattens out at around 685.6 after about 400 iterations.
Graph: Squared Error vs Iterations
To fix this, I have tried using a larger dataset for training, getting rid of regularization, and normalizing the input values.
I know that one of the steps to fix high bias is to add complexity to the approximation, but I want to maximize the performance at this particular level. Is it possible to go any further on this level?
My linear approximation code in Octave:
% Gradient descent iterations
for i = 1 : iter
  % hypothesis
  h = X * Theta;
  % copy of Theta with the bias entry zeroed, so the bias is not regularized
  regTheta = Theta;
  regTheta(1) = 0;
  % regularized cost for this iteration
  J(i, 2) = (1 / (2 * m)) * (sum((h - y) .^ 2) + lambda * sum(regTheta .^ 2));
  % gradient-descent update; the regularization gradient is scaled by alpha / m
  Theta = Theta - (alpha / m) * (((h - y)' * X)' + lambda * regTheta);
end
Note: I'm using 0 for lambda, so as to ignore regularization.
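For illustration only (a minimal sketch, not part of the original code, and assuming from the examples above that 0 encodes subtraction and 1 encodes addition): with only the three features [first number, operation, second number], the target a - b / a + b is not a linear function of the inputs, but adding the product of the operation flag and the second number as a fourth feature makes it exactly linear, so the error can drop to essentially zero:
% Hypothetical sketch with an extra interaction feature op .* b
a  = randi([0 50], 100, 1);          % first number
b  = randi([0 50], 100, 1);          % second number
op = randi([0 1], 100, 1);           % 0 = subtraction, 1 = addition (as the examples suggest)
y  = a + (2 * op - 1) .* b;          % ground truth: a - b when op = 0, a + b when op = 1
X  = [ones(100, 1) a op b op .* b];  % design matrix with bias and interaction column
Theta = X \ y;                       % direct least-squares fit (gradient descent would also work)
disp(Theta');                        % approximately [0 1 0 -1 2]
disp(max(abs(X * Theta - y)));       % residual error is essentially zero
Without that extra column, the best a purely linear model can do is split the difference between the two operations, which is consistent with the squared error plateauing rather than going to zero.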

Matlab plotting the shifted logistic function

I would like to plot the shifted logistic function, as shown by Wolfram Alpha.
In particular, I would like the function to be of the form
y = exp(x - t) / (1 + exp(x - t))
where t > 0. In the link, for example, t is 6. I had originally tried the following:
x = 0:.1:12;
y = exp(x - 6) ./ (1 + exp(x - 6));
plot(x, y);
axis([0 6 0 1])
However, this is not the same as the result from Wolfram Alpha. Here is an export of my plot.
I do not understand how what I am trying to do here differs from plotting shifted sine and cosine functions (which works using the same technique).
I am not completely new to Matlab but I do not usually use it in this way.
Edit: My values for x in the code should have been from 0 to 12.
fplot takes as inputs a function handle and a range to plot over:
>> fplot(@(x) exp(x - 6) ./ (1 + exp(x - 6)), [0 12])
The beauty of fplot in this case is you don't need to spend time calculating y-values beforehand; you could also extract values from the graph after the fact if you want (by getting the line handle's XData and YData properties).
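For example (a small sketch, assuming a MATLAB release in which fplot returns a line handle):
hl = fplot(@(x) exp(x - 6) ./ (1 + exp(x - 6)), [0 12]);  % plot and keep the line handle
xvals = get(hl, 'XData');  % x samples fplot chose adaptively
yvals = get(hl, 'YData');  % corresponding function values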
Your input to Wolfram Alpha is incorrect. It is interpreted as e*(x-6)/(1+e*(x-6)). Use plot y = exp(x - 6) / (1 + exp(x - 6)) for x from 0 to 12 in Wolfram Alpha (see here) for the same results as in MATLAB. Also use axis([0 12 0 1]) (or no axis statement at all on a new plot) to see the full results in MATLAB.
In reply to your comment: use y = exp(1)*(x - 6) ./ (1 + exp(1)*(x - 6)); to do in MATLAB what you were doing in Wolfram Alpha.

Can someone explain the behavior of the functions mkpp and ppval?

If I do the following in MATLAB:
ppval(mkpp(1:2, [1 0 0 0]),1.5)
ans = 0.12500
This should construct the polynomial f(x) = x^3 and evaluate it at x = 1.5. So why does it give me 0.125 = 0.5^3 rather than 1.5^3 = 3.375? Now, if I change the domain defined in the first argument to mkpp, I get this:
> ppval(mkpp([1 1.5 2], [[1 0 0 0]; [1 0 0 0]]), 1.5)
ans = 0
So without changing the function, I change the answer. Awesome.
Can anyone explain what's going on here? How does changing the first argument to mkpp change the result I get?
The function mkpp shifts the polynomial so that x = 0 corresponds to the start of each break interval you give it. In your first example, the polynomial x^3 is shifted to the range [1 2], so if you want to evaluate the polynomial as though it were defined on the unshifted range [0 1], you would have to do the following:
>> pp = mkpp(1:2,[1 0 0 0]); %# Your polynomial
>> ppval(pp,1.5+pp.breaks(1)) %# Shift evaluation point by the range start
ans =
3.3750 %# The answer you expect
In your second example, you have one polynomial x^3 shifted to the range [1 1.5] and another polynomial x^3 shifted to the range [1.5 2]. Evaluating the piecewise polynomial at x = 1.5 therefore gives zero, because 1.5 is the start of the second piece and (1.5 - 1.5)^3 = 0.
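As a quick check of the shift (not part of the original answer), the first example can be reproduced with polyval by shifting the evaluation point by the start of the break range:
pp = mkpp(1:2, [1 0 0 0]);     %# The piece is (x - 1)^3 on [1 2]
ppval(pp, 1.5)                 %# 0.1250
polyval([1 0 0 0], 1.5 - 1)    %# 0.1250, the same shifted evaluation
polyval([1 0 0 0], 1.5)        %# 3.3750, the unshifted value you expected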
It may help to visualize the polynomials you are making as follows:
x = linspace(0,3,100); %# A vector of x values
pp1 = mkpp([1 2],[1 0 0 0]); %# Your first piecewise polynomial
pp2 = mkpp([1 1.5 2],[1 0 0 0; 1 0 0 0]); %# Your second piecewise polynomial
subplot(1,2,1); %# Make a subplot
plot(x,ppval(pp1,x)); %# Evaluate and plot pp1 at all x
title('First Example'); %# Add a title
subplot(1,2,2); %# Make another subplot
plot(x,ppval(pp2,x)); %# Evaluate and plot pp2 at all x
axis([0 3 -1 8]) %# Adjust the axes ranges
title('Second Example'); %# Add a title