Tensor(1.0).item() vs float(Tensor(1.0)) - deep-learning

If x is a torch.Tensor of dtype torch.float, are the operations x.item() and float(x) exactly the same?

The operations x.item() and float(x) are not the same.
From the documentation of item(): it returns the value of the tensor as a standard Python number, and it works only on tensors containing a single element. It does not modify the tensor in any way.
float(), on the other hand, is the Python built-in that converts its input to a floating-point number, when possible. Find the documentation here.
To see the difference, consider another Tensor y of dtype int64:
import torch
y = torch.tensor(2)
print(y, y.dtype)
>>> tensor(2) torch.int64
print('y.item(): {}, float(y): {}'.format(y.item(), float(y)))
>>> y.item(): 2, float(y): 2.0
print(type(y.item()), type(float(y)))
>>> <class 'int'> <class 'float'>
Note that float(y) does not convert the tensor in place; you need to assign the result if you want to keep it:
z = float(y)
print('y.dtype: {}, type(z): {}'.format(y.dtype, type(z)))
>>> y.dtype: torch.int64, type(z): <class 'float'>
We can see that z is not a torch.Tensor; it is simply a Python floating-point number.
The built-in float() should also not be confused with the tensor method Tensor.float(), which converts the tensor's dtype to float32 (again not in place, so the result needs to be assigned):
print('y.float(): {},\ny.float().dtype: {},\ny: {},\ny.dtype: {}'.format(y.float(), y.float().dtype, y, y.dtype))
y.float(): 2.0,
y.float().dtype: torch.float32,
y: 2,
y.dtype: torch.int64
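To come back to the original question: for a single-element tensor of dtype torch.float, the two calls behave the same in practice; both return a plain Python float with the same value. A minimal check:
import torch

x = torch.tensor(1.0)
print(x.item(), float(x))              # 1.0 1.0
print(type(x.item()), type(float(x)))  # <class 'float'> <class 'float'>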

Related

How to use PyTorch nn.BatchNorm1d to get equal normalization across features?

I would like to ask a question regarding nn.BatchNorm1d in PyTorch.
I have one main tensor of shape [B, 3, N], and two additional tensors of shape [B, 3, V1] and [B, 3, V2]. I concatenate the main tensor with each of the two tensors separately, producing new tensors of shape [B, 3, N+V1] and [B, 3, N+V2].
I pass these tensors through a plain MLP (consisting of Conv1d and BatchNorm1d layers). Ideally, I want a "point-wise" prediction: no matter how large the last dimension is, each point should get a consistent prediction based only on its values. However, BatchNorm1d produces different results for the inputs [B, 3, N+V1] and [B, 3, N+V2], even though I only care about the first N points along the last dimension.
import torch
import torch.nn as nn

B = 2
dim = 64
N = 40000
V1 = 1000
V2 = 2000
torch.manual_seed(0)
x = torch.rand(B, dim, N)
v1 = torch.rand(B, dim, V1)
v2 = torch.rand(B, dim, V2)
layer = nn.BatchNorm1d(dim)  # batch norm is computed over the channel dimension
out2 = layer(torch.cat((x, v1), dim=2))
out3 = layer(torch.cat((x, v2), dim=2))
torch.equal(out2[:, :, :N], out3[:, :, :N])
# False: the batch statistics differ between the two concatenated inputs
Is there any possible way to have consistent prediction of first N points?
Is this more along the lines of what you're looking for? Normalizing just across the channels?
out2 = torch.cat((x, v1), dim=2) / torch.linalg.norm(torch.cat((x, v1), dim=2), dim=1, keepdim=True)
out3 = torch.cat((x, v2), dim=2) / torch.linalg.norm(torch.cat((x, v2), dim=2), dim=1, keepdim=True)
torch.equal(out2[:, :, :N], out3[:, :, :N])
# True
I think if you want to do something like this within the PyTorch nn modules, you'll need to transpose your channel and feature dimensions so that you can use LayerNorm or InstanceNorm. See here for a nice visual example of the different normalization techniques.
Updated answer:
If you want to use an nn module specifically, InstanceNorm or GroupNorm could also get you this behavior. However, the number of channels now differs between the two inputs, so you'll need two distinct layers.
layer1 = nn.GroupNorm(V1+N, V1+N)
layer2 = nn.GroupNorm(V2+N, V2+N)
out2 = layer1(torch.cat((x, v1), dim=2).transpose(1,2))
out3 = layer2(torch.cat((x, v2), dim=2).transpose(1,2))
torch.equal(out2[:, :N, :], out3[:, :N, :])
# True
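For completeness, here is the InstanceNorm variant of the same idea (a sketch reusing x, v1, v2, N, V1, V2 from the question). nn.InstanceNorm1d normalizes each channel independently over the last dimension, so after the transpose every point is normalized over only its own features, and the first N points should come out identical:
layer1 = nn.InstanceNorm1d(V1 + N)
layer2 = nn.InstanceNorm1d(V2 + N)
out2 = layer1(torch.cat((x, v1), dim=2).transpose(1, 2))
out3 = layer2(torch.cat((x, v2), dim=2).transpose(1, 2))
torch.equal(out2[:, :N, :], out3[:, :N, :])
# True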

How can I use scipy interp1d with N-D array for x without for loop

How can I use scipy.interpolate.interp1d when my x array is an N-D array, instead of a 1-D array, without using a loop?
The function f from interp1d then needs to be used with numpy.percentile with one of the arrays as an input.
I think there should be a way to do it with a list comprehension or lambda function, but I am still learning these tools.
(Note that this is different from my recent question here, because I mixed up the x and y arrays in that posted question, and the problem was not reproducible.)
Problem statement/example:
import numpy as np
from scipy import interpolate

# a is y in the interp1d docs
a = np.array([97, 4809, 4762, 282, 3879, 17454, 103, 2376, 40581])
# b is x in the interp1d docs
b = np.array([
    [0.14, 0.11, 0.29, 0.11, 0.09, 0.68, 0.09, 0.18, 0.50],
    [0.32, 0.25, 0.67, 0.25, 0.21, 1.56, 1.60, 0.41, 1.15],
])
Just trying this, below, fails with ValueError: x and y arrays must be equal in length along interpolation axis. The expected result is [array(97.), array(2376.)]. I am using the median here, but will also need the 10th, 90th, etc. percentiles.
f = interpolate.interp1d(b, a, axis=0)
f(np.percentile(b, 50, axis=0))
However this, below, works and prints array(97.)
f = interpolate.interp1d(b[0,:], a, axis=0)
f(np.percentile(b[0,:], 50, axis=0))
A loop works, but I am wondering if there is a solution using list comprehensions, lambda functions, or some other technique.
l = []
for _i in range(b.shape[0]):
    _f = interpolate.interp1d(b[_i, :], a, axis=0)
    l.append(_f(np.percentile(b[_i, :], 50, axis=0)))
print(l)
# returns
# [array(97.), array(2376.)]
Efforts:
I understand I can loop through the b array with a list comprehension.
[b[i,:] for i in range(b.shape[0])]
# returns
# [array([0.14, 0.11, 0.29, 0.11, 0.09, 0.68, 0.09, 0.18, 0.5 ]),
# array([0.32, 0.25, 0.67, 0.25, 0.21, 1.56, 1.6 , 0.41, 1.15])]
And I also understand that I can use a list comprehension to create the scipy function f for each dimension in b:
[interpolate.interp1d(b[i, :], a, axis=0) for i in range(b.shape[0])]
# returns
# [<scipy.interpolate.interpolate.interp1d at 0x1b72e404360>,
# <scipy.interpolate.interpolate.interp1d at 0x1b72e404900>]
But I don't know how to combine these two list comprehensions to apply the np.percentile function.
Using Python 3.8.3, NumPy 1.18.5, SciPy 1.3.2
If you have large data arrays, you want to stay away from for loops, map, np.vectorize and comprehensions. They will all be slow. Instead, it's always better to use vectorized numpy or scipy operations whenever possible.
In this particular case, you can implement the vectorization pretty trivially yourself. interp1d defaults to a linear interpolation, which is very simple to code by hand. For a general interpolator, the first step would be to sort x and y, which is why scipy can't support multiple x for a given y. If the x rows all have different sort order, what do you do with the y?
Luckily, there are a couple of things you can do to make this much faster than having to build a full interpolator or argsort y multiple times. For example, start by argsorting x:
idx = b.argsort(axis=1)
idx is now an array such that b[np.arange(2)[:, None], idx] gives the sorted version of b along axis 1, and a[idx] gives the corresponding y-values. Since you are taking the median (50th percentile) and the rows have an odd number of elements, the median x is just the middle element of each sorted row, and the corresponding y is given by
a[idx[:, len(a) // 2]]
If you had an even number of elements, you would have to average the elements surrounding the middle:
i = len(a) // 2 - 1
a[idx[:, i:i + 2]].mean(axis=1)
You can reduce algorithmic complexity by using np.argpartition instead of a full-blown np.argsort to get the middle element(s).
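For example, something like this (a sketch reusing a and b from the question; argpartition only guarantees that the element at the requested position ends up in its sorted place, which is all the median needs):
mid = len(a) // 2
part = np.argpartition(b, mid, axis=1)  # O(n) partial sort of each row of x
a[part[:, mid]]                         # y at each row's median x: 97 and 2376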
interp1d and the other interpolators in scipy.interpolate only support 1D x arrays, so you'll need to loop over the dimensions of x manually.
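If a Python-level loop is acceptable, the two list comprehensions from the question combine into one (a sketch using the question's a and b; interp1d sorts unsorted x internally by default):
result = [
    interpolate.interp1d(row, a)(np.percentile(row, 50))
    for row in b
]
# [array(97.), array(2376.)]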

How to specify the axis when using the softmax activation in a Keras layer?

The Keras docs for the softmax activation state that I can specify which axis the activation is applied to. My model is supposed to output an n-by-k matrix M, where Mij is the probability that the i-th letter is symbol j.
n = 7 # number of symbols in the output string (fixed)
k = len("0123456789") # the number of possible symbols
model = models.Sequential()
model.add(layers.Dense(16, activation='relu', input_shape=(N,)))
...
model.add(layers.Dense(n * k, activation=None))
model.add(layers.Reshape((n, k)))
model.add(layers.Dense(output_dim=n, activation='softmax(x, axis=1)'))
The last line of code doesn't compile as I don't know how to correctly specify the axis (the axis for k in my case) for the softmax activation.
You must use an actual function there, not a string.
Keras allows you to use a few strings for convenience.
The activation functions can be found in keras.activations, and they're listed in the help file.
from keras.activations import softmax

def softMaxAxis1(x):
    return softmax(x, axis=1)

...
model.add(layers.Dense(output_dim=n, activation=softMaxAxis1))
Or even a custom axis:
def softMaxAxis(axis):
    def soft(x):
        return softmax(x, axis=axis)
    return soft

...
model.add(layers.Dense(output_dim=n, activation=softMaxAxis(1)))
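As a side note, newer Keras versions also ship a dedicated Softmax layer that takes the axis directly, so no custom activation function is needed (a sketch; with an output of shape (n, k), axis=-1 puts the probabilities over the k symbols):
from keras.layers import Softmax
...
model.add(layers.Dense(n * k, activation=None))
model.add(layers.Reshape((n, k)))
model.add(Softmax(axis=-1))  # one distribution over the k symbols per letter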

Wrong result of sympy integration

This expression returns zero, but it shouldn't:
P = x**6 - 14*x**4 + 49*x**2 - 36
integrate(1/P, (x, 1/3, 1/2))
I also used expand on the expression, without any result.
Am I doing something wrong, or is this a bug?
This works:
from sympy import *
x = symbols('x')
P = x**6-14*x**4+49*x**2-36
I = integrate(1/expand(P), (x, S.One/3, S.One/2))
I get the result:
In [5]: I
Out[5]: -3*log(3)/80 - log(7)/48 - log(2)/48 - log(8)/240 + log(10)/240 + log(4)/48 + 3*log(5)/80
In [6]: I.n()
Out[6]: -0.00601350282195297
Alternatively, you could run the command isympy -i; this starts a SymPy prompt that converts all Python integers to SymPy integers before the input gets evaluated by the SymPy parser.
Python integer division differs between Python 2 and Python 3: the former returns an integer, the latter a floating-point number. Both differ from SymPy division, which returns exact fractions. To get SymPy division, make sure that at least one of the dividend and the divisor is a SymPy object.
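A minimal illustration of the pitfall, reusing x and P from the answer above (assuming Python 3, where 1/3 is a float; in Python 2 it is the integer 0):
1/3                 # 0.3333333333333333, an inexact Python float
S(1)/3              # 1/3, an exact SymPy Rational
Rational(1, 3)      # the same exact value, constructed directly
integrate(1/P, (x, Rational(1, 3), Rational(1, 2)))  # exact limits give the exact result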

Plotting a 3D function with Octave

I am having a problem graphing a 3D function: when I enter the data, I get a linear graph, and the values don't match what I get when I perform the calculations by hand. I believe the problem is related to how I am using matrices.
INITIAL_VALUE=999999;
INTEREST_RATE=0.1;
MONTHLY_INTEREST_RATE=INTEREST_RATE/12;
# ranges
down_payment=0.2*INITIAL_VALUE:0.1*INITIAL_VALUE:INITIAL_VALUE;
term=180:22.5:360;
[down_paymentn, termn] = meshgrid(down_payment, term);
# functions
principal=INITIAL_VALUE - down_payment;
figure(1);
plot(principal);
grid;
title("Principal (down payment)");
xlabel("down payment $");
ylabel("principal $ (amount borrowed)");
monthly_payment = (MONTHLY_INTEREST_RATE*(INITIAL_VALUE - down_paymentn))/(1 - (1 + MONTHLY_INTEREST_RATE)^-termn);
figure(2);
mesh(down_paymentn, termn, monthly_payment);
title("monthly payment (principal(down payment)) / term months");
xlabel("principal");
ylabel("term (months)");
zlabel("monthly payment");
As I said, the second figure doesn't plot the way I expect. How can I change my formula so that it renders properly?
I tried your script, and got the following error:
error: octave_base_value::array_value(): wrong type argument `complex matrix'
...
Your monthly_payment is a complex matrix (and it shouldn't be).
I guess the problem is the power operator ^. You should be using .^ for element-by-element operations.
From the documentation:
x ^ y
x ** y
Power operator. If x and y are both scalars, this operator returns x raised to the power y. If x is a scalar and y is a square matrix, the result is computed using an eigenvalue expansion. If x is a square matrix, the result is computed by repeated multiplication if y is an integer, and by an eigenvalue expansion if y is not an integer. An error results if both x and y are matrices.
The implementation of this operator needs to be improved.
x .^ y
x .** y
Element by element power operator. If both operands are matrices, the number of rows and columns must both agree.
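Concretely, under that diagnosis the question's formula would become something like this (a sketch; note that the division should most likely be element-wise too, with ./, since after the fix both operands are matrices of the meshgrid shape):
monthly_payment = (MONTHLY_INTEREST_RATE*(INITIAL_VALUE - down_paymentn)) ./ (1 - (1 + MONTHLY_INTEREST_RATE).^-termn);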