How to prevent the initial PyTorch variable from being changed by a function?

I want to apply a function to the variable x and save the result as y. But why is x changed as well? How can I prevent that?
import torch
def minus_min(raw):
    for col_i in range(len(raw[0])):
        new=raw
        new[:,col_i] = (raw[:,col_i] - raw[:,col_i].min())
    return new
x=torch.tensor([[0,1,2,3,4],
                [2,3,4,0,8],
                [0,1,2,3,4]])
y=minus_min(x)
print(y)
print(x)
output:
tensor([[0, 0, 0, 3, 0],
        [2, 2, 2, 0, 4],
        [0, 0, 0, 3, 0]])
tensor([[0, 0, 0, 3, 0],
        [2, 2, 2, 0, 4],
        [0, 0, 0, 3, 0]])

Because new=raw does not copy the tensor; it only gives the same tensor a second name. The assignment
new[:,col_i] = (raw[:,col_i] - raw[:,col_i].min())
is an in-place operation on that tensor, so x and y end up sharing the same underlying data.
The smallest change that would solve this issue would be to make a copy of x inside the function:
def minus_min(raw):
    new = raw.clone() # <--- here
    for col_i in range(len(raw[0])):
        new[:,col_i] = raw[:,col_i] - raw[:,col_i].min()
    return new
If you want, you can simplify your function (and remove the for loop):
y = x - x.min(dim=0).values
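As a quick check of both suggestions, here is a minimal sketch that runs the clone-based loop and the vectorized one-liner on the tensor from the question and confirms that x is left untouched. The names x_before, y_loop and y_vec are only illustrative and do not appear in the original post.

```python
import torch

def minus_min(raw):
    # clone() allocates new storage, so writes to `new` cannot reach `raw`
    new = raw.clone()
    for col_i in range(raw.shape[1]):
        new[:, col_i] = raw[:, col_i] - raw[:, col_i].min()
    return new

x = torch.tensor([[0, 1, 2, 3, 4],
                  [2, 3, 4, 0, 8],
                  [0, 1, 2, 3, 4]])
x_before = x.clone()  # snapshot used only for the comparison below

y_loop = minus_min(x)
# x.min(dim=0).values holds one minimum per column and broadcasts across rows
y_vec = x - x.min(dim=0).values

print(torch.equal(x, x_before))    # x was not modified
print(torch.equal(y_loop, y_vec))  # both approaches agree
```

The vectorized form builds a new tensor by construction, so it needs no explicit clone.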

Python Particle Filter: Time Series in NFOURSID Input Error

Documentation:
https://nfoursid.readthedocs.io/en/latest/
#housekeeping
#_________________________________________________________________________
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from nfoursid.kalman import Kalman
from nfoursid.nfoursid import NFourSID
from nfoursid.state_space import StateSpace
import time
import datetime
import math
import scipy as sp
from pandas_datareader import data as pdr
from IPython.display import display, Latex
from statsmodels.graphics.tsaplots import plot_acf
import yfinance as yfin
#this Time Series should be used as input
#_________________________________________________________________________
yfin.pdr_override()
spy = pdr.get_data_yahoo('AAPL',start='2022-08-23',end='2022-10-24')
spy['Log Return'] = np.log(spy['Adj Close']/spy['Adj Close'].shift(1))
AAPL=pd.DataFrame((spy['Log Return']))
#this is from the documentation and actually works
#_________________________________________________________________________
pd.set_option('display.max_columns', None)
# reproducible results
np.random.seed(0)
# create a training-set by simulating a state-space model with this many datapoints
NUM_TRAINING_DATAPOINTS = 1000
# same for the test-set
NUM_TEST_DATAPOINTS = 20
INPUT_DIM = 3
OUTPUT_DIM = 2
# actual order of the state-space model in the training- and test-set
INTERNAL_STATE_DIM = 4
NOISE_AMPLITUDE = .1 # add noise to the training- and test-set
FIGSIZE = 8
# define system matrices for the state-space model of the training-
# and test-set
A = np.array([
    [1, .01, 0, 0],
    [0, 1, .01, 0],
    [0, 0, 1, .02],
    [0, -.01, 0, 1],
]) / 1.01
B = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 1],
]
) / 3
C = np.array([
    [1, 0, 1, 1],
    [0, 0, 1, -1],
])
D = np.array([
    [1, 0, 1],
    [0, 1, 0]
]) / 10
state_space = StateSpace(A, B, C, D)
for _ in range(NUM_TRAINING_DATAPOINTS):
    input_state = np.random.standard_normal((INPUT_DIM, 1))
    noise = np.random.standard_normal((OUTPUT_DIM, 1)) * NOISE_AMPLITUDE
    state_space.step(input_state, noise)
nfoursid = NFourSID(
    # the state-space model can summarize inputs and outputs as a dataframe
    state_space.to_dataframe(),
    output_columns=state_space.y_column_names,
    input_columns=state_space.u_column_names,
    num_block_rows=10
)
nfoursid.subspace_identification()
#further methods
#_________________________________________________________________________
figsize = (1.3 * FIGSIZE, FIGSIZE) # defined here so it exists before its first use
fig, ax = plt.subplots(figsize=figsize)
nfoursid.plot_eigenvalues(ax)
fig.tight_layout()
#interpret the order from the plot (the jump in the eigenvalues), still run order->inf
ORDER_OF_MODEL_TO_FIT = 4
state_space_identified, covariance_matrix = nfoursid.system_identification(
    rank=ORDER_OF_MODEL_TO_FIT
)
#output of the model predictions
nfoursid.to_dataframe()
#prediction versus observation
fig = plt.figure(figsize=figsize)
# the state-space model can plot its inputs and outputs
state_space.plot_input_output(fig)
fig.tight_layout()
Passing AAPL to nfoursid gives:
TypeError: NFourSID.__init__() missing 1 required positional argument: 'dataframe'
Passing AAPL to state_space gives:
ValueError: Dimensions of u (43, 1) are inconsistent. Expected (3, 1).
and
TypeError: 'DataFrame' object is not callable

DolphinDB random sample functions

Can I sample an array a = [1, 2, 3, 4] based on the specified probabilities p = [0.1, 0.1, 0.3, 0.5]?
For example, in python I can use np.random.choice(a=[1, 2, 3, 4], size=100, p=[0.1, 0.1, 0.3, 0.5])
One approach is to build a new list in which each value is repeated in proportion to its probability. For example, a uniform random choice from [1, 2, 3, 3, 3, 4, 4, 4, 4, 4] is equivalent to sampling your data with the given probabilities.
You can use the take function (https://www.dolphindb.com/help/FunctionsandCommands/FunctionReferences/t/take.html) to build that list:
v = take(1, 1) <- take(2, 1) <- take(3, 3) <- take(4, 5)
rand(v, 100)
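For comparison with the np.random.choice call in the question, the same replication idea can be sketched in plain Python using only the standard library. This is an illustration of the technique, not DolphinDB code; the names pool, scale and rng are assumptions made for the example, and the approach presumes the probabilities are exact multiples of 1/scale.

```python
import random
from collections import Counter

values = [1, 2, 3, 4]
probs = [0.1, 0.1, 0.3, 0.5]

# Repeat each value in proportion to its probability (10 slots in total here).
scale = 10
pool = [v for v, p in zip(values, probs) for _ in range(round(p * scale))]
print(pool)  # [1, 2, 3, 3, 3, 4, 4, 4, 4, 4]

# A uniform draw from the pool is a weighted draw from the original values.
rng = random.Random(0)
samples = [rng.choice(pool) for _ in range(100)]
print(Counter(samples))
```

If the probabilities do not reduce to small integer counts, the pool gets long, and a weighted sampler such as random.choices(values, weights=probs, k=100) is the more direct tool in Python.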

How can I concatenate pytorch tensors or lists in a distributed multi-node setup?

I am trying to implement something like this for 2 nodes (each node with 2 GPUs):
#### Parallel process initiated with torch.distributed.init_process_group()
### All GPUs work in parallel, and generate lists like :
[20, 0, 1, 17] for GPU0 of node A
[1, 2, 3, 4] for GPU1 of node A
[5, 6, 7, 8] for GPU0 of node B
[0, 2, 4, 6] for GPU1 of node B
I tried
torch.distributed.reduce()
to get a sum of these 4:
[26, 10, 15, 35]
But what I want is a concatenated version like this
[[20, 0, 1, 17], [1, 2, 3, 4] , [5, 6, 7, 8] , [0, 2, 4, 6]]
Or
[20, 0, 1, 17, 1, 2, 3, 4, 5, 6, 7, 8, 0, 2, 4, 6]
is also OK with me.
Is it possible to achieve this from torch.distributed?
You can use dist.all_gather to do this:
import torch
import torch.distributed as dist
world_size = dist.get_world_size() # total number of GPU processes you are running, 4 in your case
q = torch.tensor([20, 0, 1, 17]) # generated on each gpu (with different values) as you mentioned
all_q = [torch.zeros_like(q) for _ in range(world_size)] # one receive buffer per process
dist.all_gather(all_q, q) # fills all_q in place and returns None, so do not assign the result
all_q would then have the following:
[torch.tensor([20, 0, 1, 17]), torch.tensor([1, 2, 3, 4]), torch.tensor([5, 6, 7, 8]), torch.tensor([0, 2, 4, 6])]
You can then use torch.cat to collapse all elements into one array if you like.
You can use dist.all_gather_multigpu if you have a list of lists of tensors.
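To illustrate the last step without a multi-node setup, the sketch below uses a hand-written list as a stand-in for what all_gather fills in (the values are the ones from the question) and shows the two output shapes the question asks for. The names flat and nested are illustrative only.

```python
import torch

# Stand-in for the list that dist.all_gather fills in, one tensor per process.
all_q = [torch.tensor([20, 0, 1, 17]),
         torch.tensor([1, 2, 3, 4]),
         torch.tensor([5, 6, 7, 8]),
         torch.tensor([0, 2, 4, 6])]

flat = torch.cat(all_q)      # joins along the existing dimension, shape (16,)
nested = torch.stack(all_q)  # adds a new leading dimension, shape (4, 4)

print(flat.tolist())
print(nested.tolist())
```

torch.cat gives the flattened variant and torch.stack gives the list-of-lists variant.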

How do I make a function that will allow me to store values in a list?

I need to create a python function, which will recursively store fibonacci values in a list, and then return that list to me. Then, I can print that list. Here is what I have.
def recFib(x):
    result = []
    if x == 1:
        return result.append(1)
    if x == 2:
        return result.append(2)
    for i in range(2,x):
        result.append(recFib(i-1)+recFib(i-2))
    return result
I am new to Python, so a lot of concepts are new to me, and I can't figure out what I'm doing wrong.
Here is a solution based on @Akavall's answer:
def recFib(x):
    fibArray = [0, 1]
    def fib(x):
        if x < len(fibArray):
            return fibArray[x]
        temp = fib(x - 1) + fib(x - 2)
        fibArray.append(temp)
        return temp
    return fib(x)
There are a few issues with your code, for example:
return result.append(1)
will return None, not result with 1 appended to it.
Also,
def recFib(x):
    result = []
will set result to [] every time recFib is called; I don't think that's what you wanted.
And I am not sure that I understand your entire logic, sorry.
A possible solution is something like this:
def recFib(x):
    n_to_fib = {0: 0, 1: 1}
    def fib(x):
        if x in n_to_fib:
            return n_to_fib[x]
        temp = fib(x - 1) + fib(x - 2)
        n_to_fib[x] = temp
        return temp
    return fib(x)
if __name__ == "__main__":
    result = [recFib(i) for i in range(8)]
    print(result)
Result:
[0, 1, 1, 2, 3, 5, 8, 13]
And calculating new values is fast because the previous values are stored in n_to_fib dict and thus don't have to be recomputed.
We can also modify recFib to return the list directly:
    fib(x)
    return sorted(n_to_fib.values())
instead of:
    return fib(x)
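Put together, the modified version might look like the sketch below. The name rec_fib_list is hypothetical, chosen here so it does not clash with recFib; everything else follows the memoized answer with the changed return.

```python
def rec_fib_list(x):
    # cache of already computed values, seeded with the two base cases
    n_to_fib = {0: 0, 1: 1}

    def fib(n):
        if n in n_to_fib:
            return n_to_fib[n]
        temp = fib(n - 1) + fib(n - 2)
        n_to_fib[n] = temp
        return temp

    fib(x)  # fills the cache up to index x
    # Fibonacci values never decrease, so sorting restores index order
    return sorted(n_to_fib.values())

print(rec_fib_list(7))
```

Because the cache is seeded with both base cases, the returned list always has at least two entries, even for x = 0.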
Try this:
def recFib(n):
    if n == 1:
        return [0]
    elif n == 2:
        return [0, 1]
    else:
        a = recFib(n-1)
        return a + [a[-1] + a[-2]]
The key in the program is here:
return a + [a[-1] + a[-2]]
a # This gets the most recently created list, since it's `recFib(n-1)`
+ # Append to the next list (but does not return `None`)
[ a[-1] # Last value of `recFib(n-1)`, which is the previous value.
+ # Add to
a[-2]] # Second last value of `recFib(n-1)`, which is the second previous value.
Here are the breakpoints for recFib(5):
n = 3
a[-1]: 1
a[-2]: 0
n = 4
a[-1]: 1
a[-2]: 1
n = 5
a[-1]: 2
a[-2]: 1
[0, 1, 1, 2, 3]
It doesn't require two functions. I sort of find that as cheating:
>>> recFib(20)
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181]
>>> recFib(10)
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
>>> recFib(19)
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584]
Generating a list by for looping a recursive function is very inefficient. You are recalculating the whole sequence every time you use the function.
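One way to avoid that recomputation while staying recursive is to pass the growing list along as an accumulator, so each value is computed exactly once. This is only a sketch; fib_list and acc are names invented for the example.

```python
def fib_list(n, acc=None):
    """Return the first n Fibonacci numbers, starting from 0."""
    if acc is None:
        acc = [0, 1]  # fresh list per top-level call, never a shared default
    if n <= len(acc):
        return acc[:n]
    acc.append(acc[-1] + acc[-2])  # extend by one value, then recurse
    return fib_list(n, acc)

print(fib_list(10))
```

The recursion depth grows linearly with n, so for very long sequences a plain loop would be the safer choice.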

How to convert an upper/lower gpuarray to the specific format required by cublasStbsv?

I am currently using pycuda and scikits.cuda to solve the linear equation A*x = b, where A is an upper/lower triangular matrix. However, the cublasStbsv routine requires a specific format.
To give an example: if a lower triangular matrix A = [[1, 0, 0], [2, 3, 0], [4, 5, 6]], then the input required by cublasStbsv should be [[1, 3, 6], [2, 5, 0], [4, 0, 0]], where the rows are the diagonal, the first subdiagonal, and the second subdiagonal, respectively. With numpy this is easily done with stride_tricks.as_strided, but I don't know how to do something similar with pycuda.gpuarray. I found pycuda.compyte.array.as_strided, but it cannot be applied to a gpuarray. Any help would be appreciated, thanks.
I got it done by using theano: first convert it to a CudaNdarray, change the strides, and copy it back to a gpuarray. Just be careful about the difference between Fortran and C order.
Update:
I finally got it done by using gpuarray.multi_take_put:
import numpy as np
import pycuda.gpuarray as gpuarray
def make_triangle(s_matrix, uplo = 'L'):
    """convert a triangular matrix to the specific format
    required by cublasStbsv; the matrix should be in Fortran order,
    s_matrix: gpuarray
    """
    #make sure the dtype is float32
    if s_matrix.dtype != 'f':
        s_matrix = s_matrix.astype('f')
    dim = s_matrix.shape[0]
    if uplo == 'L':
        idx_tuple = np.tril_indices(dim)
        # gidx: column-major offset of A[i, j]; gdst: its offset in the band layout
        gidx = gpuarray.to_gpu(idx_tuple[0] + idx_tuple[1] * dim)
        gdst = gpuarray.to_gpu(idx_tuple[0] + idx_tuple[1] * (dim - 1))
        return gpuarray.multi_take_put([s_matrix], gdst, gidx, (dim, dim))[0]
    else:
        idx_tuple = np.triu_indices(dim)
        gidx = gpuarray.to_gpu(idx_tuple[0] + idx_tuple[1] * dim)
        gdst = gpuarray.to_gpu(idx_tuple[0] + (idx_tuple[1] + 1) * (dim - 1))
        return gpuarray.multi_take_put([s_matrix], gdst, gidx, (dim, dim))[0]
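The index arithmetic can be checked on the CPU without a GPU. The sketch below reproduces the same source and destination offsets on flat column-major Python lists and applies them to the example matrix from the question. The helper to_band is hypothetical and is not part of pycuda or scikits.cuda.

```python
def to_band(a, uplo="L"):
    """Pack a triangular n x n matrix (list of rows) into the n x n band layout.

    Both buffers are flat and column-major (Fortran order), mirroring the
    offsets used in make_triangle.
    """
    n = len(a)
    src = [a[i][j] for j in range(n) for i in range(n)]  # column-major copy
    dst = [0] * (n * n)
    for j in range(n):
        rows = range(j, n) if uplo == "L" else range(0, j + 1)
        for i in rows:
            if uplo == "L":
                dst_idx = i + j * (n - 1)        # diagonal lands in band row 0
            else:
                dst_idx = i + (j + 1) * (n - 1)  # diagonal lands in the last band row
            dst[dst_idx] = src[i + j * n]
    # read the flat buffer back as rows for display
    return [[dst[r + c * n] for c in range(n)] for r in range(n)]

lower = [[1, 0, 0], [2, 3, 0], [4, 5, 6]]
print(to_band(lower, "L"))
```

For the lower case this prints the layout given in the question; for the upper case the diagonal ends up in the last row, with the superdiagonals above it.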