Error in sin function for Python

I have a question about using the sine function. When I enter x = 4, 8, etc., I expect to get a number very close to zero, but not exactly zero (e.g. 0.001, 0.0003). However, I got y = 1.224 when x = 4, and y = -2.449 when x = 8. This seems incorrect, and I don't understand the problem. Does anyone know what is going on here?
[Photo of my code and a sin graph - Link]
https://ibb.co/6YCWW90
[Code]
import math
import matplotlib.pyplot as plt
import numpy as np
x = [0, 1, 2, 3, 4, 5, 6, 7, 8]
y = [math.sin(0.25 * math.pi * i) for i in x]
print(y)
plt.plot(x, y)
plt.show()

Everything works.
When i is 0, then "0.25 * math.pi * i" is precisely 0 and when you calculate the sine, you get exactly 0.0.
When i is 4, then calculating "0.25 * math.pi * i" results in a number very close to pi, but floating-point accuracy is limited. If you calculate the sine, you get a number which is very, very close to zero, but because of the limited accuracy, not exactly zero. The result is 1.2246467991473532e-16. NOTE: the "e-16" suffix means "times 10 to the power -16", so this is 0.00000000000000012246467991473532, not 1.224 as you wrote in your question.
Similarly rounding errors result in -2.4492935982947064e-16 for i equal to 8. The argument is not exactly 2 PI and rounding errors result in a value slightly different than 0.0.
Again -2.4492935982947064e-16 is -0.00000000000000024492935982947064 and not -2.449 as you wrote in your question.
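A minimal sketch of how one might check this in practice, using only the standard library. The tolerance 1e-12 is an arbitrary choice for illustration, not something from the question:

```python
import math

# Evaluate sin at the multiples of pi from the question (i = 4 and i = 8).
y4 = math.sin(0.25 * math.pi * 4)
y8 = math.sin(0.25 * math.pi * 8)

print(y4)  # tiny value in scientific notation (note the e-16 suffix)
print(y8)

# Compare against zero with an absolute tolerance instead of ==.
# abs_tol=1e-12 is an assumed tolerance chosen for this illustration.
print(math.isclose(y4, 0.0, abs_tol=1e-12))
print(math.isclose(y8, 0.0, abs_tol=1e-12))

# Fixed-point formatting makes the magnitude obvious when reading output.
print(f"{y4:.6f}")
```

Printing with a fixed number of decimals, or rounding before plotting labels, avoids misreading the exponent.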

Area of the quarter circle with SymPy Integrate?

I would like to evaluate the following integral using SymPy:
from sympy import *
x = symbols('x')
a = symbols('a', positive=True)
expr = sqrt(a**2 - x**2)
integrate(expr, (x, 0, pi/2))
What I would expect as an outcome is the area of the quarter circle (i.e., a^2*pi/4). Unfortunately, SymPy does not provide this result. When considering
integrate(expr, x)
I obtain the correct indefinite integral but when adding the limits it does not work.
Any ideas what I am doing wrong?
Thanks and all best,
VK88
The upper limit should be a, not pi/2, if a is the radius and you are working in Cartesian coordinates:
from sympy import *
x = symbols('x')
a = symbols('a', positive=True)
expr = sqrt(a**2 - x**2)
integrate(expr, (x, 0, a))
That gives:
   2
π⋅a
────
 4
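As a quick numerical sanity check of the limits, here is a sketch that approximates the same integral with a midpoint rule in plain Python. The helper name area_under_arc, the radius a = 2.0 and the step count are assumptions made for this illustration:

```python
import math

def area_under_arc(a, upper, n=100_000):
    """Midpoint-rule estimate of the integral of sqrt(a^2 - x^2) on [0, upper]."""
    h = upper / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h          # midpoint of the k-th subinterval
        total += math.sqrt(a * a - x * x)
    return total * h

a = 2.0                             # arbitrary radius for the check
approx = area_under_arc(a, upper=a) # integrate from 0 to a, not 0 to pi/2
exact = math.pi * a**2 / 4          # area of the quarter circle
print(approx, exact)
```

With the upper limit set to a, the two printed values should agree to several decimal places.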

How to calculate Batch Pairwise Distance in PyTorch efficiently

I have tensors X of shape BxNxD and Y of shape BxMxD.
I want to compute the pairwise distances for each element in the batch, i.e. I want a BxMxN tensor.
How do I do this?
There is some discussion on this topic here: https://github.com/pytorch/pytorch/issues/9406, but I don't understand it as there are many implementation details while no actual solution is highlighted.
A naive approach would be to use the answer for non-batched pairwise distances as discussed here: https://discuss.pytorch.org/t/efficient-distance-matrix-computation/9065, i.e.
import torch
import numpy as np
B = 32
N = 128
M = 256
D = 3
X = torch.from_numpy(np.random.normal(size=(B, N, D)))
Y = torch.from_numpy(np.random.normal(size=(B, M, D)))
def pairwise_distances(x, y=None):
    # Squared Euclidean distances between the rows of x and the rows of y.
    x_norm = (x**2).sum(1).view(-1, 1)
    if y is not None:
        y_t = torch.transpose(y, 0, 1)
        y_norm = (y**2).sum(1).view(1, -1)
    else:
        y_t = torch.transpose(x, 0, 1)
        y_norm = x_norm.view(1, -1)
    dist = x_norm + y_norm - 2.0 * torch.mm(x, y_t)
    return torch.clamp(dist, 0.0, np.inf)
out = []
for b in range(B):
    out.append(pairwise_distances(X[b], Y[b]))
print(torch.stack(out).shape)
How can I do this without looping over B?
Thanks
I had a similar issue and spent some time looking for the easiest and fastest solution. You can now compute batched distances with PyTorch's torch.cdist, which gives you a BxMxN tensor:
torch.cdist(Y, X)
Note that cdist returns Euclidean distances (p=2 by default), whereas the function in the question returns squared distances. It also works well if you just want to compute distances between each pair of rows of two matrices.
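To show what a loop-free batched computation does under the hood, here is a sketch of the same norm-expansion trick from the question, written with NumPy as a stand-in for PyTorch. The helper name batched_sq_dists and the small sizes are assumptions for illustration; in PyTorch the same pattern would use torch.matmul with a transposed last two dimensions, or simply torch.cdist as above:

```python
import numpy as np

rng = np.random.default_rng(0)
B, N, M, D = 4, 5, 6, 3          # small sizes for illustration
X = rng.normal(size=(B, N, D))
Y = rng.normal(size=(B, M, D))

def batched_sq_dists(y, x):
    """Squared Euclidean distances, shape (B, M, N), with no loop over B."""
    y_norm = (y ** 2).sum(axis=2)[:, :, None]   # (B, M, 1)
    x_norm = (x ** 2).sum(axis=2)[:, None, :]   # (B, 1, N)
    cross = y @ x.transpose(0, 2, 1)            # (B, M, N) batched matmul
    # Clip tiny negative values caused by rounding before any sqrt.
    return np.clip(y_norm + x_norm - 2.0 * cross, 0.0, None)

sq = batched_sq_dists(Y, X)
dist = np.sqrt(sq)                               # Euclidean distances
print(dist.shape)

# Cross-check one entry against a direct computation.
direct = np.linalg.norm(Y[1, 2] - X[1, 3])
print(dist[1, 2, 3], direct)
```

Broadcasting the two norm terms against the batched matrix product replaces the Python loop over B.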

Using linear approximation to perform addition and subtraction | error barrier

I'm attempting my first solo project, after taking an introductory course to machine learning, where I'm trying to use linear approximation to predict the outcome of addition/subtraction of two numbers.
I have 3 features: first number, subtraction/addition (0 or 1), and second number.
So my input looks something like this:
3 0 1
4 1 2
3 0 3
With corresponding output like this:
2
6
0
I have (I think) successfully implemented the linear regression algorithm, as the squared error does gradually decrease, but with 100 training values ranging from 0 to 50, the squared error flattens out at around 685.6 after about 400 iterations.
[Graph: Squared Error vs Iterations]
To fix this, I have tried using a larger dataset for training, getting rid of regularization, and normalizing the input values.
I know that one of the steps to fix high bias is to add complexity to the approximation, but I want to maximize the performance at this particular level. Is it possible to go any further on this level?
My linear approximation code in Octave:
% Iterate
for i = 1 : iter
  % hypothesis
  h = X * Theta;
  % reg theta prep (the bias term in the first row is not regularized)
  regTheta = Theta;
  regTheta(1, :) = 0;
  % cost calc
  J(i, 2) = (1 / (2 * m)) * (sum((h - y) .^ 2) + lambda * sum(sum(regTheta .^ 2, 1), 2));
  % theta calc (the regularization gradient is lambda * regTheta)
  Theta = Theta - (alpha / m) * (X' * (h - y) + lambda * regTheta);
end
Note: I'm using 0 for lambda, so as to ignore regularization.
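One way to see whether the plateau is a limit of the model rather than of the training loop is to compare closed-form least-squares fits. The sketch below is an illustration under assumptions, not the asker's code: it uses synthetic data generated the same way as described (0 = subtract, 1 = add), NumPy's lstsq instead of gradient descent, and a hypothetical extra feature op*b that is not in the original post:

```python
import numpy as np

rng = np.random.default_rng(0)
m = 100
a = rng.integers(0, 51, size=m).astype(float)   # first number, 0..50
op = rng.integers(0, 2, size=m).astype(float)   # 0 = subtract, 1 = add
b = rng.integers(0, 51, size=m).astype(float)   # second number, 0..50
y = np.where(op == 1, a + b, a - b)

def fit_mse(features):
    """Ordinary least squares fit; returns the mean squared residual."""
    X = np.column_stack([np.ones(m)] + features)
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.mean((X @ theta - y) ** 2)

mse_linear = fit_mse([a, op, b])            # the three original features
mse_interact = fit_mse([a, op, b, op * b])  # plus an op*b interaction term
print(mse_linear, mse_interact)
```

Because the target equals a - b + 2*op*b, it is not a linear function of the three original features alone, so even the best possible fit on them leaves a large residual. With the interaction term the fit becomes essentially exact.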

Compare two linear regression models in MATLAB

I want to compare the performance of two models using the F statistic. Here is a reproducible example and the expected results:
load carbig
tbl = table(Acceleration,Cylinders,Horsepower,MPG);
% Fitting both models separately
mdl1 = fitlm(tbl,'MPG~1+Acceleration+Cylinders+Horsepower');
mdl2 = fitlm(tbl,'MPG~1+Acceleration');
% Comparing both models using the F-test and p-value
numerator = (mdl2.SSE-mdl1.SSE)/(mdl1.NumCoefficients-mdl2.NumCoefficients);
denominator = mdl1.SSE/mdl1.DFE;
F = numerator/denominator;
p = 1-fcdf(F,mdl1.NumCoefficients-mdl2.NumCoefficients,mdl1.DFE);
We end up with F = 298.75 and p = 0, indicating mdl1 is significantly better than mdl2, as assessed by the F statistic.
Is there any way to obtain the F and p values without running fitlm twice and doing all the computation by hand?
I tried to run coefTest, as suggested by @Glen_b; however, the function is poorly documented and the results are not the ones I'm expecting.
[p,F] = coefTest(mdl1); % p = 0, F = 262.508 (this F test mdl1 vs constant mdl)
[p,F] = coefTest(mdl1,[0,0,1,1]); % p = 0, F = 57.662 (not sure what this is testing)
[p,F] = coefTest(mdl1,[1,1,0,0]); % p = 0, F = 486.810 (idem)
I believe I should carry out the test with a different null hypothesis (C) using the form [p,F] = coefTest(mdl1,H,C), but I don't really know how to do it and there is no example.
This answer is in regards to comparing two linear regression models where one model is a restricted version of the other.
Short answer:
To do an F-test on the restriction that the 3rd and 4th elements of your estimated coefficient vector b are zero:
[p, F] = coefTest(mdl1, [0, 0, 1, 0; 0, 0, 0, 1]);
Further explanation:
Let b be our estimated vector. Linear restrictions on b are typically written in a matrix form: R*b = r. The restriction that 3rd and 4th element of b are zero would be written:
[0, 0, 1, 0; 0, 0, 0, 1] * b = [0; 0]
The matrix [0, 0, 1, 0; 0, 0, 0, 1] is what coefTest calls the H matrix in the docs.
P = coefTest(M,H), with H a numeric matrix having one column for each
coefficient, performs an F test that H*B=0, where B represents the
coefficient vector.
Long version
Sometimes with these econometric routines, it's nice to write the computation out yourself so you know what's really going on.
Remove rows with NaN because they just add unrelated complexity:
tbl_dirty = table(Acceleration,Cylinders,Horsepower,MPG);
tbl = tbl_dirty(~any(ismissing(tbl_dirty),2),:);
Do the estimation etc...
n = height(tbl); % number of observations
y = tbl.MPG;
X = [ones(n, 1), tbl.Acceleration, tbl.Cylinders, tbl.Horsepower];
k = size(X,2); % number of variables (including constant)
b = X \ y; % estimate b with least squares
u = y - X * b; % calculates residuals
s2 = u' * u / (n - k); % estimate variance of error term (assuming homoskedasticity, independent observations)
BCOV = inv(X'*X) * s2; % get covariance matrix of b assuming homoskedasticity of error term etc...
bse = diag(BCOV).^.5; % standard errors
R = [0, 0, 1, 0;
0, 0, 0, 1];
r = [0; 0]; % Testing restriction: R * b = r
num_restrictions = size(R, 1);
F = (R*b - r)'*inv(R * BCOV * R')*(R*b - r) / num_restrictions; % F-stat (see Hayashi for reference)
Fp = 1 - fcdf(F, num_restrictions, n - k); % F p-val
For reference, see p. 65 of Hayashi's book Econometrics.
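To check that the restriction test on the full model reproduces the F statistic from comparing the two fitted models, here is a sketch in Python with NumPy. It uses synthetic regressors as stand-ins (not the carbig data), and the helper name ols is an assumption for this illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# Synthetic stand-ins for the three regressors (not the carbig data).
x1, x2, x3 = rng.normal(size=(3, n))
y = 1.0 + 0.5 * x1 - 2.0 * x2 + 0.3 * x3 + rng.normal(size=n)

X_full = np.column_stack([np.ones(n), x1, x2, x3])
X_small = X_full[:, :2]                      # intercept + x1 only

def ols(X, y):
    """Return the least-squares coefficients and the sum of squared errors."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return b, resid @ resid

b, sse_full = ols(X_full, y)
_, sse_small = ols(X_small, y)
k = X_full.shape[1]
q = 2                                        # number of restrictions

# F from the two fitted models (the formula used in the question).
F_sse = ((sse_small - sse_full) / q) / (sse_full / (n - k))

# F from the full model alone, testing R @ b = 0 (the formula in the answer).
R = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
bcov = np.linalg.inv(X_full.T @ X_full) * sse_full / (n - k)
Rb = R @ b
F_wald = Rb @ np.linalg.solve(R @ bcov @ R.T, Rb) / q
print(F_sse, F_wald)
```

Under ordinary least squares with the homoskedastic covariance estimate, the two statistics coincide, which is why a single fit of the larger model is enough.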
No, there is not.
fitlm fits an arbitrary model; in your case, a regression model with an intercept and either one or three regressors. It might seem that the model with three regressors can reuse information from the model with one regressor, but this is only true if there are some restrictions on the model, and even then the overlapping information is limited.
fitlm is a very general framework which can be used for arbitrary models. Fitting multiple regressions at the same time while sharing information can thus get quite complex, and it is not implemented.
It is possible to implement this yourself for these two specific models. Usually such a linear regression is solved using the covariance matrix:
Beta = (X' X) ^-1 X' y
where X is the data with the variables as columns and y is the target variable. In this case you could reuse the part of the covariance matrix that only needs the columns from the smaller regression: the variation in Acceleration. Since adding 2 new variables adds 8 values to the covariance matrix, you only save 1/9 of the time. Furthermore, the heaviest part is the inversion, so the time saved is very small.
In short, just do two separate regressions.

Matlab plotting the shifted logistic function

I would like to plot the shifted logistic function as shown from Wolfram Alpha.
In particular, I would like the function to be of the form
y = exp(x - t) / (1 + exp(x - t))
where t > 0. In the link, for example, t is 6. I had originally tried the following:
x = 0:.1:12;
y = exp(x - 6) ./ (1 + exp(x - 6));
plot(x, y);
axis([0 6 0 1])
However, this is not the same as the result from Wolfram Alpha. Here is an export of my plot.
I do not understand what the difference is between what I am trying to do here vs. plotting shifted sin and cosine functions (which works using the same technique).
I am not completely new to Matlab but I do not usually use it in this way.
Edit: My values for x in the code should have been from 0 to 12.
fplot takes as inputs a function handle and a range to plot for:
>> fplot(@(x) exp(x-6) ./ (1 + exp(x-6)), [0 12])
The beauty of fplot in this case is you don't need to spend time calculating y-values beforehand; you could also extract values from the graph after the fact if you want (by getting the line handle's XData and YData properties).
Your input to Wolfram Alpha is incorrect. It is interpreted as e*(x-6)/(1+e*(x-6)), i.e. e multiplied by (x-6) rather than e raised to the power (x-6). Use plot y = exp(x - 6) / (1 + exp(x - 6)) for x from 0 to 12 in Wolfram Alpha (see here) for the same results as in MATLAB. Also use axis([0 12 0 1]) (or no axis statement at all on a new plot) to see the full results in MATLAB.
In reply to your comment: use y = exp(1)*(x - 6) ./ (1 + exp(1)*(x - 6)); to do in MATLAB what you were doing in Wolfram Alpha.
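To make the difference between the two readings concrete, here is a small sketch in Python that evaluates both at a few points. The function names shifted_logistic and misparsed are made up for this illustration:

```python
import math

def shifted_logistic(x, t=6.0):
    """exp(x - t) / (1 + exp(x - t)): the function the question intends."""
    return math.exp(x - t) / (1.0 + math.exp(x - t))

def misparsed(x, t=6.0):
    """e*(x - t) / (1 + e*(x - t)): e times (x - t), not e to the power."""
    return math.e * (x - t) / (1.0 + math.e * (x - t))

# Compare the two at the ends and the middle of the plotted range.
for x in (0.0, 6.0, 12.0):
    print(x, shifted_logistic(x), misparsed(x))
```

The logistic passes through 0.5 at x = t and stays between 0 and 1, while the misparsed expression is 0 at x = t and is not bounded in that way.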