What is the SPSS syntax for a regression with a quadratic term and an interaction? In R the code would be:
fit <- lm(c ~ a*b + a*I(b^2), dat)
or
fit <- lm(c ~ a*(b+I(b^2)), dat)
Thanks for help.
Using REGRESSION you need to actually make the variables in the SPSS data file before submitting the command. So if your variables were named the same:
COMPUTE ab = a*b /*Interaction*/.
COMPUTE bsq = b**2 /*squared term*/.
COMPUTE absq = a*bsq /*Interaction with squared term*/.
Then these can be placed on the right hand side of your regression equation.
REGRESSION VARIABLES=a,b,ab,bsq,absq,c
/DEPENDENT=c
/METHOD=ENTER a,b,ab,bsq,absq.
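To make explicit which predictor columns the R formula c ~ a*b + a*I(b^2) expands to, here is a minimal Python sketch (not SPSS syntax; the function name design_row is made up for illustration) that builds the same derived variables as the COMPUTE statements:

```python
def design_row(a, b):
    """Return the predictors for one case: a, b, a*b, b^2 and a*b^2."""
    ab = a * b        # interaction
    bsq = b ** 2      # squared term
    absq = a * bsq    # interaction with the squared term
    return {"a": a, "b": b, "ab": ab, "bsq": bsq, "absq": absq}


row = design_row(1, 3.0)
print(row)
```

All five columns (a, b, ab, bsq, absq) would then go on the right-hand side of the regression.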
I thought you could only use factor variables for the interactions, but I was wrong: you can use continuous variables as well (sorry!). Here is an example using MIXED (you still need to make the separate variables if using REGRESSION).
INPUT PROGRAM.
LOOP Case = 1 TO 200000.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
COMPUTE a = RV.BERNOULLI(0.5).
COMPUTE b = RV.NORMAL(0,1).
COMPUTE ab = a*b /*Interaction*/.
COMPUTE bsq = b**2 /*squared term*/.
COMPUTE absq = a*bsq /*Interaction with squared term*/.
COMPUTE c = 0.5 + 0.2*a + 0.1*b -0.05*ab + .03*bsq -.001*absq + RV.NORMAL(0,1).
VARIABLE LEVEL a (NOMINAL).
RECODE a (0 = 2)(ELSE = COPY).
MIXED c BY a WITH b bsq
/FIXED = a b b*b a*b a*b*b
/PRINT SOLUTION.
I'm new to Octave, and if this has been asked and answered then I'm sorry, but I have no idea what the phrase is for what I'm looking for.
I'm trying to remove the DC component from a large matrix, but in chunks, as I need to do calculations on each chunk.
What I got so far
r = dlmread('test.csv',';',0,0);
x = r(:,2);
y = r(:,3); % we work on the 3rd column
d = 1
while d <= (length(y) - 256)
e = y(d:d+256);
avg = sum(e) / length(e);
k(d:d+256) = e - avg; % this is the part I need help with, how to get the chunk with the right value into the matrix
d += 256;
endwhile
% to check the result I like to see it
plot(x, k, '.');
if I change the line into:
k(d:d+256) = e - 1024;
it works perfectly.
I know there is something like an element-wise operation, but if I use e .- avg I get this:
warning: the '.-' operator was deprecated in version 7
and it still doesn't do what I expect.
I must be missing something, any suggestions?
GNU Octave, version 7.2.0 on Linux(Manjaro).
Never mind, the code works as expected.
The result (k) got corrupted because the chosen chunk size was too small for my signal. Changing 256 to 4096 got me a better result.
+ and - are always element-wise. Beware that d:d+256 is 257 elements, not 256, so if you then increment d by 256, consecutive chunks overlap by one point.
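The off-by-one point can be illustrated with a small Python sketch (a hypothetical helper, not the asker's Octave code) in which every chunk covers exactly `chunk` samples and no sample belongs to two chunks. Unlike the loop in the question, this version also processes a shorter trailing chunk instead of skipping it; that choice is an assumption.

```python
def remove_dc_in_chunks(y, chunk=256):
    """Subtract each chunk's own mean; chunks do not overlap."""
    k = []
    for start in range(0, len(y), chunk):
        e = y[start:start + chunk]      # `chunk` samples (the last chunk may be shorter)
        avg = sum(e) / len(e)           # DC component of this chunk
        k.extend(v - avg for v in e)    # subtraction is applied to every element
    return k


signal = [1024 + (i % 4) for i in range(10)]
print(remove_dc_in_chunks(signal, chunk=4))
```

In Octave the equivalent index range would be d:d+255 when d is advanced by 256.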
I have solved a differential equation with a neural net. I leave code below with an example. I want to be able to compute the first derivative of this neural net with respect to its input "x" and evaluate this derivative for any "x".
1- Notice that I compute der = discretization.derivative. Is that the derivative of the neural net with respect to "x"? If I type [first(der(phi, u, [x], 0.00001, 1, res.minimizer)) for x in xs], I get something that may be the derivative, but I cannot find a way to extract it into an array, let alone plot it. How can I evaluate this derivative at any point, let's say for all points in the array defined below as "xs"? In the Update below I give a more straightforward approach I tried for computing the derivative (still without success).
2- Is there any other way that I could take the derivative with respect to x of the neural net?
I am new to Julia, so I am struggling a bit with how to manipulate the data types. Thanks for any suggestions!
Update: I found a way to see the symbolic expression for the neural net doing the following:
predict(x) = first(phi(x,res.minimizer))
df(x) = gradient(predict, x)[1]
After running these two lines, type predict(x) or df(x) in the REPL and it prints the full neural net with the weights and biases of the solution. However, I cannot evaluate the gradient; it throws an error. How can I evaluate the gradient of my function predict(x) with respect to x?
The original code creating the neural net and solving the equation
using NeuralPDE, Flux, ModelingToolkit, GalacticOptim, Optim, DiffEqFlux
import ModelingToolkit: Interval, infimum, supremum
@parameters x
@variables u(..)
Dx = Differential(x)
a = 0.5
eq = Dx(u(x)) ~ -log(x*a)
# Initial and boundary conditions
bcs = [u(0.) ~ 0.01]
# Space and time domains
domains = [x ∈ Interval(0.01,1.0)]
# Neural network
n = 15
chain = FastChain(FastDense(1,n,tanh),FastDense(n,1))
discretization = PhysicsInformedNN(chain, QuasiRandomTraining(100))
@named pde_system = PDESystem(eq,bcs,domains,[x],[u(x)])
prob = discretize(pde_system,discretization)
const losses = []
cb = function (p,l)
push!(losses, l)
if length(losses)%100==0
println("Current loss after $(length(losses)) iterations: $(losses[end])")
end
return false
end
res = GalacticOptim.solve(prob, ADAM(0.01); cb = cb, maxiters=300)
prob = remake(prob,u0=res.minimizer)
res = GalacticOptim.solve(prob,BFGS(); cb = cb, maxiters=1000)
phi = discretization.phi
der = discretization.derivative
using Plots
analytic_sol_func(x) = (1.0+log(1/a))*x-x*log(x)
dx = 0.05
xs = LinRange(0.01,1.0,50)
u_real = [analytic_sol_func(x) for x in xs]
u_predict = [first(phi(x,res.minimizer)) for x in xs]
x_plot = collect(xs)
xconst = analytic_sol_func(1)*ones(size(xs))
plot(x_plot ,u_real,title = "Solution",linewidth=3)
plot!(x_plot ,u_predict,line =:dashdot,linewidth=2)
The solution I found is to differentiate the approximation with the help of ForwardDiff.
If the neural network approximation to the unknown function is called "funcres", then we take its derivative with respect to x as shown below.
using ForwardDiff
funcres(x) = first(phi(x,res.minimizer))
dxu = ForwardDiff.derivative.(funcres, Array(x_plot))
display(plot(x_plot,dxu,title = "Derivative",linewidth=3))
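For readers wondering what forward-mode differentiation does with a network like this, here is a minimal Python sketch using dual numbers. It is an illustration only, not how ForwardDiff is implemented, and the tiny 1-2-1 tanh network with its weights (W1, B1, W2, B2) is entirely made up.

```python
import math


class Dual:
    """Number val + der*eps with eps**2 == 0; der carries the derivative."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def tanh(z):
    """tanh for dual numbers: d/dz tanh(z) = 1 - tanh(z)**2."""
    t = math.tanh(z.val)
    return Dual(t, (1.0 - t * t) * z.der)


# Hypothetical weights for a 1 -> 2 -> 1 network (illustration only)
W1, B1 = [0.5, -1.2], [0.1, 0.3]
W2, B2 = [0.7, 0.4], -0.2


def net(x):
    """Evaluate the toy network on a Dual input."""
    hidden = [tanh(w * x + b) for w, b in zip(W1, B1)]
    return sum((w * h for w, h in zip(W2, hidden)), Dual(B2))


def derivative(f, x):
    """Seed the input with derivative 1 and read the derivative off the output."""
    return f(Dual(x, 1.0)).der


xs = [0.01 + 0.1 * i for i in range(10)]
print([derivative(net, x) for x in xs])
```

The list comprehension at the end plays the role of the broadcast call ForwardDiff.derivative.(funcres, ...) in the Julia answer.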
I have a simple model where I want to minimize the RMSE between my dependent variable y and my model values. The model is: y = alpha + beta'*x.
For minimization, I am using Matlab's fmincon function and am struggling with multiplying my parameter p(2) by x.
MWE:
% data
y = [5.072, 7.1588, 7.263, 4.255, 6.282, 6.9118, 4.044, 7.2595, 6.898, 4.8744, 6.5179, 7.3434, 5.4316, 3.38, 5.464, 5.90, 6.80, 6.193, 6.070, 5.737]
x = [18.3447, 79.86538, 85.09788, 10.5211, 44.4556, 69.567, 8.960, 86.197, 66.857, 16.875, 52.2697, 93.971, 24.35, 5.118, 25.126, 34.037, 61.4445, 42.704, 39.531, 29.988]
% initial values
p_initial = [0, 0];
% function: SEE BELOW
objective = @(p) sqrt(mean((y - y_mod(p)).^2));
% optimization
[param_opt, fval] = fmincon(objective, p_initial)
If I specify my function as follows then it works.
y_mod = @(p) p(1) + p(2).*x
However, it does not work if I use the following code. How can I multiply p(2) with x? Where x is not optimized, because the values are given.
function f = y_mod(p)
f = p(1) + p(2).*x
end
Here is the output from a script that has the function declaration:
>> modelFitExample2a
RMS Error=0.374, intercept=4.208, slope=0.0388
And here is code for the above. It has many commented lines because it includes alternate ways to fit the data: an inline declaration of y_mod(), or a multi-line declaration of y_mod(), or no y_mod() at all. This version uses the multi-line declaration of y_mod().
%modelFitExample2a.m WCR 2021-01-19
%Reply to stack exchange question on parameter fitting
clear;
global x %need this if define y_mod() separately, and in that case y_mod() must declare x global
% data
y = [5.0720, 7.1588, 7.2630, 4.2550, 6.2820, 6.9118, 4.0440, 7.2595, 6.8980, 4.8744...
6.5179, 7.3434, 5.4316, 3.3800, 5.4640, 5.9000, 6.8000, 6.1930, 6.0700, 5.7370];
x = [18.3447,79.8654,85.0979,10.5211,44.4556,69.5670, 8.9600,86.1970,66.8570,16.8750,...
52.2697,93.9710,24.3500, 5.1180,25.1260,34.0370,61.4445,42.7040,39.5310,29.9880];
% initial values
p_initial = [0, 0];
%predictive model with parameter p
%y_mod = @(p) p(1) + p(2)*x;
% objective function
%If you use y_mod(), then you must define it somewhere
objective = @(p) sqrt(mean((y - y_mod(p)).^2));
%objective = @(p) sqrt(mean((y-p(1)-p(2)*x).^2));
% optimization
options = optimset('Display','Notify');
[param_opt, fval] = fmincon(objective,p_initial,[],[],[],[],[],[],[],options);
% display results
fprintf('RMS Error=%.3f, intercept=%.3f, slope=%.4f\n',...
fval,param_opt(1),param_opt(2));
%function declaration: predictive model
%This is an alternative to the inline definition of y_mod() above.
function f = y_mod(p)
global x
f = p(1) + p(2)*x;
end
carl,
The second method, in which you declare y_mod() explicitly (at the end of your script, or in a separate file y_mod.m), does not work because y_mod() does not know what x is. Fix it by declaring x global in the main program at the top, and declare x global in y_mod().
%function declaration
function f = y_mod(p)
global x
f = p(1) + p(2)*x;
end
Of course you don't need y_mod() at all. The code also works if you use the following, and in this case, no global x is needed:
% objective function
objective = @(p) sqrt(mean((y-p(1)-p(2)*x).^2));
By the way, you don't need to multiply with .* in y_mod. You may use *, because you are multiplying a scalar by a vector.
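The difference between the two approaches is one of scope: the anonymous function captures x when it is created, while a separately declared function only sees what is passed in or declared global. A minimal Python sketch of the capture idea follows (not MATLAB; the helper names are made up, and a closed-form least-squares line stands in for fmincon, which is valid here because minimizing RMSE and minimizing the sum of squares give the same parameters).

```python
import math


def make_y_mod(x):
    """Return a model function that remembers x, so no global is needed."""
    def y_mod(p):
        return [p[0] + p[1] * xi for xi in x]
    return y_mod


def rmse(y, pred):
    return math.sqrt(sum((yi - fi) ** 2 for yi, fi in zip(y, pred)) / len(y))


def least_squares_line(x, y):
    """Closed-form [intercept, slope]; a stand-in for the numerical optimizer."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    return [ybar - slope * xbar, slope]


# data from the question
y = [5.072, 7.1588, 7.263, 4.255, 6.282, 6.9118, 4.044, 7.2595, 6.898, 4.8744,
     6.5179, 7.3434, 5.4316, 3.38, 5.464, 5.90, 6.80, 6.193, 6.070, 5.737]
x = [18.3447, 79.86538, 85.09788, 10.5211, 44.4556, 69.567, 8.960, 86.197,
     66.857, 16.875, 52.2697, 93.971, 24.35, 5.118, 25.126, 34.037, 61.4445,
     42.704, 39.531, 29.988]

y_mod = make_y_mod(x)                    # x is captured here
objective = lambda p: rmse(y, y_mod(p))  # same shape as the MATLAB objective
p_opt = least_squares_line(x, y)
print("RMS Error=%.3f, intercept=%.3f, slope=%.4f"
      % (objective(p_opt), p_opt[0], p_opt[1]))
```

In MATLAB the same effect can be had without globals by giving y_mod a second argument and writing the objective as an anonymous function that passes x along.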
32-bit Octave has a limit on the maximum number of elements in an array. I have recompiled from source (following the script at https://github.com/calaba/octave-3.8.2-enable-64-ubuntu-14.04 ), and now have 64-bit indexing.
Nevertheless, when I attempt to perform elementwise multiplication using a broadcast function, I get error: out of memory or dimension too large for Octave's index type
Is this a bug, or an undocumented feature? If it's a bug, does anyone have a reasonably efficient workaround?
Minimal code to reproduce the problem:
function indexerror();
% both of these are formed without error
% a = zeros (2^32, 1, 'int8');
% b = zeros (1024*1024*1024*3, 1, 'int8');
% sizemax % returns 9223372036854775806
nnz = 1000 % number of non-zero elements
rowmax = 250000
colmax = 100000
irow = zeros(1,nnz);
icol = zeros(1,nnz);
for ind =1:nnz
irow(ind) = round(rowmax/nnz*ind);
icol(ind) = round(colmax/nnz*ind);
end
sparseMat = sparse(irow,icol,1,rowmax,colmax);
% column vector to be broadcast
broad = 1:rowmax;
broad = broad(:);
% this gives "dimension too large" error
toobig = bsxfun(@times,sparseMat,broad);
% so does this
toobig2 = sparse(repmat(broad,1,size(sparseMat,2)));
mult = sparse( sparseMat .* toobig2 ); % never made it this far
end
EDIT:
Well, I have an inefficient workaround. It's slower than using bsxfun by a factor of 3 or so (depending on the details), but it's better than having to sort through the error in the libraries. Hope someone finds this useful some day.
% loop over rows, instead of using bsxfun
mult_loop = sparse([],[],[],rowmax,colmax);
for ind =1:length(broad);
mult_loop(ind,:) = broad(ind) * sparseMat(ind,:);
end
The unfortunate answer is that yes, this is a bug. Apparently bsxfun and repmat are returning full matrices rather than sparse ones. A bug report has been filed here:
http://savannah.gnu.org/bugs/index.php?47175
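The reason a full matrix is never needed here is that scaling row r of a sparse matrix only touches the stored nonzeros in that row. A minimal Python sketch of that idea, working directly on (row, column, value) triplets with 0-based indices, is below. It is an illustration with a made-up helper name, not Octave code and not the fix adopted in the bug report.

```python
def scale_rows(rows, cols, vals, broad):
    """Multiply each stored entry (r, c, v) by broad[r]; the sparsity pattern is unchanged."""
    return rows, cols, [v * broad[r] for r, v in zip(rows, vals)]


# Same shape as the example above: 250000 x 100000 with 1000 nonzeros
rowmax, colmax, count = 250000, 100000, 1000
irow = [round(rowmax / count * i) - 1 for i in range(1, count + 1)]
icol = [round(colmax / count * i) - 1 for i in range(1, count + 1)]
ones = [1.0] * count
broad = range(1, rowmax + 1)  # the column vector 1..rowmax, never materialized as a matrix

_, _, scaled = scale_rows(irow, icol, ones, broad)
print(scaled[:3], len(scaled))
```

The work and memory are proportional to the number of nonzeros (1000 here), not to rowmax times colmax.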
I have used Gnuplot to plot my data, along with a linear regression line. Currently, the 'title' of this line, which has its equation calculated by Gnuplot, is just "f(x)". However, I would like the title to be the equation of the regression line, e.g. "y=mx+c".
I can do this manually by reading off 'm' and 'c' from the plotting info output, then re-plot with the new title. I would like this process to be automated, and was wondering if this can be done, and how to go about doing it.
With a data file Data.csv:
0 0.00000
1 1.00000
2 1.41421
3 1.73205
4 2.00000
5 2.23607
you can do a linear fitting with:
f(x) = a*x + b
fit f(x) 'Data.csv' u 1:2 via a, b
You can define a string-valued user function in gnuplot to build the legend title of your fitted function f(x):
title_f(a,b) = sprintf('f(x) = %.2fx + %.2f', a, b)
Now in order to plot the data with the regression function f(x) simply do:
plot "Data.csv" u 1:2 w l, f(x) t title_f(a,b)
You should end up with this plot:
From "Correlation coefficient on gnuplot":
Another, perhaps slightly shorter, way than Woltan's of doing the same thing may be:
# This command will analyze your data and set STATS_* variables. See help stats
stats 'Data.csv' using 1:2
f(x) = STATS_slope * x + STATS_intercept
plot f(x) title sprintf("y=%.2fx+%.2f", STATS_slope, STATS_intercept)
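As a cross-check of what the title should contain, the slope and intercept for the sample Data.csv above can be computed by ordinary least squares. The following is a small Python sketch (not gnuplot; linear_fit is a made-up helper) that builds the same title string as the sprintf call in the first answer:

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = m*x + c."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    m = sxy / sxx
    return m, ybar - m * xbar


# contents of Data.csv from the answer above
xs = [0, 1, 2, 3, 4, 5]
ys = [0.00000, 1.00000, 1.41421, 1.73205, 2.00000, 2.23607]

m, c = linear_fit(xs, ys)
title = "f(x) = %.2fx + %.2f" % (m, c)
print(title)
```

gnuplot's fit is iterative, so its a and b may differ from these values in the last digits.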