scipy.fft.ifftn of a complex pyopencl.array

I'm trying to add two 3D complex arrays on the GPU using pyopencl and then take the inverse fast Fourier transform of the result. But I get an error that I do not really understand. Any advice on improving the code's performance would also be welcome.
import pyopencl as cl
import numpy as np
import os
from scipy.fftpack import fftn, ifftn
import pyopencl.array as cl_array
from pyopencl.elementwise import ElementwiseKernel
os.environ['PYOPENCL_COMPILER_OUTPUT'] = '1'
ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
Lx = 50
Ly = 50
Lz = 1
M1f = np.ones((2 * Lx - 1, 2 * Ly - 1, 2 * Lz - 1)).astype(np.float32)
M2f = np.ones((2 * Lx - 1, 2 * Ly - 1, 2 * Lz - 1)).astype(np.float32)
FM1 = fftn(M1f)
FM2 = fftn(M2f)
res_cpu = FM1 + FM2
print(ifftn(res_cpu))
FM1_gpu = cl_array.to_device(queue, np.reshape(FM1, (2*Lx-1)*(2*Ly-1)*(2*Lz-1)).astype(np.complex64))
FM2_gpu = cl_array.to_device(queue, np.reshape(FM2, (2*Lx-1)*(2*Ly-1)*(2*Lz-1)).astype(np.complex64))
complex_add = ElementwiseKernel(ctx,
"float *x, "
"float *y, "
"float *z",
"z[i] = x[i] + y[i]",
"complex_add")
add_gpu = cl_array.empty_like(FM1_gpu)
complex_add(FM1_gpu, FM2_gpu, add_gpu)
res_gpu = np.zeros((2*Lx-1, 2*Ly-1, 2*Lz-1)).astype(np.complex64)
res_gpu = np.reshape(add_gpu, (2*Lx-1, 2*Ly-1, 2*Lz-1))
print(ifftn(res_gpu))
I expect to get the correct value of the inverse FFT of the two complex arrays that were added on the GPU, but instead I get this error:
Traceback (most recent call last):
  File "/home/heisenberg/Desktop/НИР/FM/math/GPU/loopsum.py", line 43, in <module>
    print(ifftn(res_gpu))
  File "/home/heisenberg/.local/lib/python3.7/site-packages/scipy/fftpack/basic.py", line 670, in ifftn
    return _raw_fftn_dispatch(x, shape, axes, overwrite_x, -1)
  File "/home/heisenberg/.local/lib/python3.7/site-packages/scipy/fftpack/basic.py", line 628, in _raw_fftn_dispatch
    tmp = _asfarray(x)
  File "/home/heisenberg/.local/lib/python3.7/site-packages/scipy/fftpack/basic.py", line 136, in _asfarray
    return numpy.asarray(x, dtype=x.dtype)
  File "/home/heisenberg/.local/lib/python3.7/site-packages/numpy/core/numeric.py", line 538, in asarray
    return array(a, dtype, copy=False, order=order)
TypeError: must be real number, not Array
Process finished with exit code 1
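For reference, a minimal sketch of one way past this error, assuming the goal is simply to bring the summed array back to the host before calling ifftn (scipy's FFT routines operate on NumPy arrays, not on pyopencl arrays). As a side note, the kernel declares float* arguments while the arrays are complex64, and pyopencl arrays already support elementwise +, so the custom kernel may not be needed at all:
# copy the device array back into an ordinary NumPy array, then reshape and transform on the CPU
res_host = add_gpu.get().reshape(2*Lx-1, 2*Ly-1, 2*Lz-1)
print(ifftn(res_host))
# equivalent without the custom kernel, since pyopencl arrays overload '+':
# res_host = (FM1_gpu + FM2_gpu).get().reshape(2*Lx-1, 2*Ly-1, 2*Lz-1)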

What is wrong with my neural net model with LSTM for regression problem that it doesn't return the model as output?

So, the question is this:
What am I doing wrong when defining the neural net architecture? See the sections "Define the neural network model" and "Define the learning rate scheduler and train the model" below.
Details:
I have written the code below, where revenue_data has shape (1749, 2) and weather_data has shape (86990, 10); X_train has shape ([69010, 14]), y_train ([69010]), X_val ([17253, 14]) and y_val ([17253]). I have done the preprocessing, scaling, outlier removal and data splitting as follows:
# Convert date and time columns to datetime format
revenue_data['Date'] = pd.to_datetime(revenue_data['Date'], format='%Y%m%d')
weather_data['dt'] = pd.to_datetime(weather_data['dt'], format='%Y%m%d')
weather_data['time'] = pd.to_datetime(weather_data['time'], format='%H:%M:%S')
# Convert wind and condition columns to embeddings
wind_embeddings = nn.Embedding(len(weather_data['wind'].unique()), 5)
weather_data['wind_code'] = weather_data['wind'].astype('category').cat.codes
wind_vectors = wind_embeddings(torch.tensor(weather_data['wind_code'].values, dtype=torch.long))
weather_data['wind_x'] = wind_vectors[:, 0].detach().numpy()
weather_data['wind_y'] = wind_vectors[:, 1].detach().numpy()
weather_data['wind_z'] = wind_vectors[:, 2].detach().numpy()
weather_data['wind_t'] = wind_vectors[:, 3].detach().numpy()
weather_data['wind_u'] = wind_vectors[:, 4].detach().numpy()
condition_embeddings = nn.Embedding(len(weather_data['condition'].unique()), 3)
weather_data['condition_code'] = weather_data['condition'].astype('category').cat.codes
condition_vectors = condition_embeddings(torch.tensor(weather_data['condition_code'].values, dtype=torch.long))
weather_data['condition_x'] = condition_vectors[:, 0].detach().numpy()
weather_data['condition_y'] = condition_vectors[:, 1].detach().numpy()
weather_data['condition_z'] = condition_vectors[:, 2].detach().numpy()
# Group the weather data by date and hour and calculate the mean for each date and hour
weather_data = weather_data.groupby(['dt', 'time']).mean()
weather_data = weather_data.reset_index()
weather_data['Date'] = weather_data['dt']
weather_data.drop(['dt', 'time', 'wind_code', 'condition_code'], axis=1, inplace=True)
# Merge the revenue and weather data on the 'Date' column and drop 'Date'
merged_data = pd.merge(revenue_data, weather_data, on='Date')
merged_data.drop('Date', axis=1, inplace=True)
merged_data.head()
# Scale the data
scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(merged_data)
# Split the data into input and target sets
X = scaled_data[:, 1:]
y = scaled_data[:, 0]
from scipy.stats import zscore
# Calculate z-scores for each feature; remove outliers with a z-score bigger than 3
z_scores = zscore(X)
# Identify rows where any feature has a z-score > 3
mask = (z_scores > 3).any(axis=1)
# Remove rows with high z-scores from X and y
features = X[~mask, :]
target = y[~mask]
# Split the data into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)
# Convert the data to PyTorch tensors
X_train = torch.tensor(X_train, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.float32)
X_val = torch.tensor(X_val, dtype=torch.float32)
y_val = torch.tensor(y_val, dtype=torch.float32)
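As a side note (an assumption about intent, since the question is about the model itself): the train_test_split call a few lines above uses the original X and y rather than the filtered features and target, so the outlier removal never affects the training data. If it is meant to, the split would presumably look like this:
X_train, X_val, y_train, y_val = train_test_split(features, target, test_size=0.2, random_state=42)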
but I am struggling to figure out what is wrong with the neural net architecture defined here:
Define the neural network model
class RevenuePredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=14, hidden_size=32, num_layers=1, batch_first=True)
        self.fc1 = nn.Linear(32, 16)
        self.fc2 = nn.Linear(16, 1)

    def forward(self, x, lengths):
        print('x shape:', x.shape)
        # Get the lengths of the input sequences
        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        lengths = lengths.to(device)
        lengths = lengths.cpu()
        print('lengths shape:', lengths.shape)
        # Sort the input sequences by length
        sorted_lengths, sorted_idx = torch.sort(lengths, descending=True)
        sorted_x = x[sorted_idx]
        # Pack the sorted input sequences
        packed_x = nn.utils.rnn.pack_padded_sequence(sorted_x, sorted_lengths, batch_first=True)
        # Convert the packed sequence to a tensor with two dimensions
        x_data, batch_sizes = nn.utils.rnn.pad_packed_sequence(packed_x, batch_first=True)
        # Convert the packed sequence to a tensor with two dimensions
        x_data, batch_sizes = x.data, x.batch_sizes
        seq_len = batch_sizes[0]
        batch_size = len(batch_sizes)
        x = x_data.new_zeros((batch_size, seq_len, 14))
        s = 0
        for i, l in enumerate(batch_sizes):
            x[i, :l] = x_data[s:(s+l)]
            s += l
        # Pass the packed input sequences through the LSTM
        lstm_output, (h, c) = self.lstm(packed_x)
        # Unpack the LSTM output sequences
        unpacked_output, _ = nn.utils.rnn.pad_packed_sequence(lstm_output, batch_first=True)
        # Re-sort the output sequences to their original order
        unsorted_idx = sorted_idx.sort(0)
        output = unpacked_output[unsorted_idx]
        # Pass the output sequences through the fully connected layers
        output = nn.functional.relu(self.fc1(output[:, -1, :]))
        output = self.fc2(output)
        return output
Then I create the model:
model = RevenuePredictor()
followed by the loss function, optimizer and metrics:
loss_fn = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)
metrics = {
'mse': MeanSquaredError(),
'mae': MeanAbsoluteError(),
'r2': R2Score(),
}
Define the learning rate scheduler and train the model
scheduler = ReduceLROnPlateau(optimizer, mode='min', factor=0.1, patience=10, verbose=True)
best_val_loss = np.inf
for epoch in range(num_epochs):
    # Set the model to training mode
    model.train()
    train_loss = 0.0
    num_batches = 0
    for X_train, y_train in train_loader:
        lengths = torch.ones(X_train.shape[0], dtype=torch.long)
        optimizer.zero_grad()
        output = model(X_train, lengths)
        loss = loss_fn(output, y_train)
        loss.backward()
        optimizer.step()
        train_loss += loss.item()
        num_batches += 1
    val_loss = 0.0
    for X_val, y_val in val_loader:
        lengths = torch.ones(X_val.shape[0], dtype=torch.long)
        output = model(X_val, lengths)
        loss = loss_fn(output, y_val)
        val_loss += loss.item()
    scheduler.step(val_loss)
    val_loss /= len(val_loader)
    val_mse = metrics['mse'].compute()
    val_mae = metrics['mae'].compute()
    val_r2 = metrics['r2'].compute()
    for metric in metrics.values():
        metric.reset()
    if (epoch+1) % 100 == 0:
        print('Epoch [{}/{}], Train Loss: {:.4f}, Val Loss: {:.4f}, MSE: {:.4f}, MAE: {:.4f}, R2: {:.4f}'
              .format(epoch+1, num_epochs, train_loss/num_batches, val_loss, val_mse, val_mae, val_r2))
I get this error, which I think is caused by something being wrong in how the neural network model is defined:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py", line 3326, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-164-e20b93c25048>", line 3, in <module>
output = model(X_train, lengths)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "<ipython-input-163-43b2ef5c15db>", line 38, in forward
lstm_output, (h, c) = self.lstm(packed_x)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/rnn.py", line 772, in forward
self.check_forward_args(input, hx, batch_sizes)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/rnn.py", line 697, in check_forward_args
self.check_input(input, batch_sizes)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/rnn.py", line 206, in check_input
raise RuntimeError(
RuntimeError: input must have 2 dimensions, got 1
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py", line 2040, in showtraceback
stb = value._render_traceback_()
AttributeError: 'RuntimeError' object has no attribute '_render_traceback_'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/IPython/core/ultratb.py", line 1101, in get_records
return _fixed_getinnerframes(etb, number_of_lines_of_context, tb_offset)
File "/usr/local/lib/python3.8/dist-packages/IPython/core/ultratb.py", line 319, in wrapped
return f(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/IPython/core/ultratb.py", line 353, in _fixed_getinnerframes
records = fix_frame_records_filenames(inspect.getinnerframes(etb, context))
File "/usr/lib/python3.8/inspect.py", line 1515, in getinnerframes
frameinfo = (tb.tb_frame,) + getframeinfo(tb, context)
File "/usr/lib/python3.8/inspect.py", line 1473, in getframeinfo
filename = getsourcefile(frame) or getfile(frame)
File "/usr/lib/python3.8/inspect.py", line 708, in getsourcefile
if getattr(getmodule(object, filename), '__loader__', None) is not None:
File "/usr/lib/python3.8/inspect.py", line 737, in getmodule
file = getabsfile(object, _filename)
File "/usr/lib/python3.8/inspect.py", line 721, in getabsfile
return os.path.normcase(os.path.abspath(_filename))
File "/usr/lib/python3.8/posixpath.py", line 379, in abspath
cwd = os.getcwd()
FileNotFoundError: [Errno 2] No such file or directory
---------------------------------------------------------------------------
I tried converting the packed sequence to a tensor with two dimensions in a different way:
x_data, batch_sizes = x.data, x.batch_sizes
seq_len = batch_sizes[0]
batch_size = len(batch_sizes)
x = x_data.new_zeros((batch_size, seq_len, 14))
s = 0
for i, l in enumerate(batch_sizes):
    x[i, :l] = x_data[s:(s+l)]
    s += l
That didn't work.
Then I tried reshaping x to have three dimensions, like:
batch_size, seq_len, input_size = x.shape
That didn't work either, and finally I tried
unsqueeze(-1) on the output after defining the model, like:
model = RevenuePredictor()
output = model(X_train, lengths).unsqueeze(-1)
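For what it's worth, here is a minimal sketch of a simplified forward pass that avoids packing altogether, under the assumption that each row of X_train is a sequence of length 1 (which is what lengths = torch.ones(...) in the training loop suggests). This is one guess at the intended behaviour, not a verified fix, and SimpleRevenuePredictor is a hypothetical name:
class SimpleRevenuePredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=14, hidden_size=32, num_layers=1, batch_first=True)
        self.fc1 = nn.Linear(32, 16)
        self.fc2 = nn.Linear(16, 1)

    def forward(self, x):
        # x: (batch, 14) -> (batch, 1, 14), i.e. a batch of length-1 sequences
        x = x.unsqueeze(1)
        lstm_out, _ = self.lstm(x)                      # (batch, 1, 32)
        out = torch.relu(self.fc1(lstm_out[:, -1, :]))  # (batch, 16)
        return self.fc2(out).squeeze(-1)                # (batch,) to match y_train
With a model like this, the lengths argument and the pack/unpack machinery disappear, which also sidesteps the "input must have 2 dimensions, got 1" check inside the LSTM.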

PyTorch: Apply cross entropy loss with custom weight map

I am solving a multi-class segmentation problem using the U-Net architecture in PyTorch.
As specified in the U-Net paper, I am trying to implement custom weight maps to counter class imbalance.
Below is the operation I want to apply, i.e. the weight-map formula from the U-Net paper: w(x) = w_c(x) + w0 * exp(-(d1(x) + d2(x))^2 / (2 * sigma^2)).
Also, I reduced the batch size to 1 so that I can remove that dimension when passing the target to the precompute_for_image function.
I tried the approach below:
def precompute_for_image(masks):
    masks = masks.cpu()
    cls = masks.unique()
    res = torch.stack([torch.where(masks == cls_val, torch.tensor(1), torch.tensor(0)) for cls_val in cls])
    return res

def train(n_epochs, loaders, model, optimizer, criterion, use_cuda, save_path):
    ###################
    # train the model #
    ###################
    model.train()
    for batch_idx, (data, target) in enumerate(final_train_loader):
        # move to GPU
        if use_cuda:
            data, target = data.cuda(), target.cuda()
        optimizer.zero_grad()
        output = model(data)
        temp_target = precompute_for_image(target)
        w = weight_map(temp_target)
        loss = criterion(output, target)
        loss = w * loss
        loss.backward()
        optimizer.step()
        train_loss = train_loss + ((1 / (batch_idx + 1)) * (loss.data - train_loss))
    return model
where weight_map is the function that calculates the weight mask, which I got from here.
The issue I am facing is that I get a memory error when I apply this method. I am using 61 GB of RAM and a Tesla V100 GPU.
I really think I am applying it in an incorrect way.
How should I do it?
I am omitting the non-essential details from the training loop.
Below is my weight_map function:
from skimage.segmentation import find_boundaries

w0 = 10
sigma = 5

def make_weight_map(masks):
    """
    Generate the weight maps as specified in the UNet paper
    for a set of binary masks.

    Parameters
    ----------
    masks: array-like
        A 3D array of shape (n_masks, image_height, image_width),
        where each slice of the matrix along the 0th axis represents one binary mask.

    Returns
    -------
    array-like
        A 2D array of shape (image_height, image_width)
    """
    nrows, ncols = masks.shape[1:]
    masks = (masks > 0).astype(int)
    distMap = np.zeros((nrows * ncols, masks.shape[0]))
    X1, Y1 = np.meshgrid(np.arange(nrows), np.arange(ncols))
    X1, Y1 = np.c_[X1.ravel(), Y1.ravel()].T
    for i, mask in enumerate(masks):
        # find the boundary of each mask,
        # compute the distance of each pixel from this boundary
        bounds = find_boundaries(mask, mode='inner')
        X2, Y2 = np.nonzero(bounds)
        xSum = (X2.reshape(-1, 1) - X1.reshape(1, -1)) ** 2
        ySum = (Y2.reshape(-1, 1) - Y1.reshape(1, -1)) ** 2
        distMap[:, i] = np.sqrt(xSum + ySum).min(axis=0)
    ix = np.arange(distMap.shape[0])
    if distMap.shape[1] == 1:
        d1 = distMap.ravel()
        border_loss_map = w0 * np.exp((-1 * (d1) ** 2) / (2 * (sigma ** 2)))
    else:
        if distMap.shape[1] == 2:
            d1_ix, d2_ix = np.argpartition(distMap, 1, axis=1)[:, :2].T
        else:
            d1_ix, d2_ix = np.argpartition(distMap, 2, axis=1)[:, :2].T
        d1 = distMap[ix, d1_ix]
        d2 = distMap[ix, d2_ix]
        border_loss_map = w0 * np.exp((-1 * (d1 + d2) ** 2) / (2 * (sigma ** 2)))
    xBLoss = np.zeros((nrows, ncols))
    xBLoss[X1, Y1] = border_loss_map
    # class weight map
    loss = np.zeros((nrows, ncols))
    w_1 = 1 - masks.sum() / loss.size
    w_0 = 1 - w_1
    loss[masks.sum(0) == 1] = w_1
    loss[masks.sum(0) == 0] = w_0
    ZZ = xBLoss + loss
    return ZZ
Traceback of the error-
MemoryError Traceback (most recent call last)
<ipython-input-30-f0a595b8de7e> in <module>
1 # train the model
2 model_scratch = train(20, final_train_loader, unet, optimizer,
----> 3 criterion, train_on_gpu, 'model_scratch.pt')
<ipython-input-29-b481b4f3120e> in train(n_epochs, loaders, model, optimizer, criterion, use_cuda, save_path)
24 loss = criterion(output,target)
25 target.requires_grad = False
---> 26 w = make_weight_map(target)
27 loss = W*loss
28 loss.backward()
<ipython-input-5-e75a6281476f> in make_weight_map(masks)
33 X2, Y2 = np.nonzero(bounds)
34 xSum = (X2.reshape(-1, 1) - X1.reshape(1, -1)) ** 2
---> 35 ySum = (Y2.reshape(-1, 1) - Y1.reshape(1, -1)) ** 2
36 distMap[:, i] = np.sqrt(xSum + ySum).min(axis=0)
37 ix = np.arange(distMap.shape[0])
MemoryError:
Your final_train_loader provides you with the input image data and the expected pixel-wise labeling target. I assume (following PyTorch's conventions) that data is of shape B-3-H-W and of dtype=torch.float.
More importantly, target is of shape B-H-W and of dtype=torch.long.
On the other hand make_weight_map expects its input to be C-H-W (with C = number of classes, NOT batch size), of type numpy array.
Try providing make_weight_map the input mask as it expects it and see if you get similar errors.
I also recommend that you visualize the resulting weight map - to make sure your function does what you expect it to do.
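If it helps, here is a rough sketch of feeding make_weight_map the C-H-W NumPy input it expects when the batch size is 1, and applying the resulting map through an unreduced loss. This is an assumption about the intended usage, not a verified fix, and the reduction='none' criterion is my addition rather than something from the original post:
# target: tensor of shape (1, H, W) with integer class labels (batch_size = 1)
masks_np = precompute_for_image(target[0]).numpy()   # (n_classes, H, W) binary masks
w = make_weight_map(masks_np)                        # (H, W) NumPy weight map
w = torch.from_numpy(w).float().to(output.device)
# use an unreduced loss so the per-pixel weights can be applied before averaging
criterion = nn.CrossEntropyLoss(reduction='none')
loss = (w * criterion(output, target)).mean()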

When calling the Convolution class, it raises an error

The call gdd.forward(x) raises an error, but why?
This code uses im2col to implement the convolution layer.
Traceback (most recent call last):
File "E:/PycharmProjects/untitled2/kk.py", line 61, in <module>
gdd.forward(x)
File "E:/PycharmProjects/untitled2/kk.py", line 46, in forward
FN,C,FH,FW=self.W.shape
ValueError: not enough values to unpack (expected 4, got 2)
import numpy as np

class Convolution:
    # convolution kernel size
    def __init__(self, W, b, stride=1, pad=0):
        self.W = W
        self.b = b
        self.stride = stride
        self.pad = pad

    def forward(self, x):
        FN, C, FH, FW = self.W.shape
        N, C, H, W = x.shape
        out_h = int(1 + (H + 2*self.pad - FH) / self.stride)
        out_w = int(1 + (W + 2*self.pad - FW) / self.stride)

e = np.array([[2,0,1],[0,1,2],[1,0,2]])
x = np.array([[1,2,3,0],[0,1,2,3],[3,0,1,2],[2,3,0,1]])
gdd = Convolution(e, 3, 1, 0)
gdd.forward(x)
"not enough values to unpack" means that there are only 2 values available, but you are expecting 4:
FN,C,FH,FW=self.W.shape
Just get rid of two of them and you are good to go :)
BTW, I'm assuming you speak Chinese? I speak Chinese as well, so feel free to ask in Chinese if anything is unclear.
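A minimal sketch of that suggestion, assuming a single 2D input image and a single 2D kernel (so both shape unpackings drop to two values); as in the original, forward still only computes the output size:
    def forward(self, x):
        FH, FW = self.W.shape            # 2D kernel, e.g. (3, 3)
        H, W = x.shape                   # 2D input, e.g. (4, 4)
        out_h = int(1 + (H + 2*self.pad - FH) / self.stride)
        out_w = int(1 + (W + 2*self.pad - FW) / self.stride)
        return out_h, out_w
Alternatively, the four-value unpacking can be kept by reshaping the arrays to 4D first, e.g. e.reshape(1, 1, 3, 3) and x.reshape(1, 1, 4, 4).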

Move an object towards a target in PyBullet

I'm pretty new to PyBullet and physics engines in general. My first step is trying to get one object to move towards another.
import pybullet as p
import time
import pybullet_data
DURATION = 10000
physicsClient = p.connect(p.GUI)#or p.DIRECT for non-graphical version
p.setAdditionalSearchPath(pybullet_data.getDataPath()) #optionally
print("data path: %s " % pybullet_data.getDataPath())
p.setGravity(0,0,-10)
planeId = p.loadURDF("plane.urdf")
cubeStartPos = [0,0,1]
cubeStartOrientation = p.getQuaternionFromEuler([0,0,0])
boxId = p.loadURDF("r2d2.urdf",cubeStartPos, cubeStartOrientation)
gemId = p.loadURDF("duck_vhacd.urdf", [2,2,1], p.getQuaternionFromEuler([0,0,0]) )
for i in range(DURATION):
    p.stepSimulation()
    time.sleep(1./240.)
    gemPos, gemOrn = p.getBasePositionAndOrientation(gemId)
    cubePos, cubeOrn = p.getBasePositionAndOrientation(boxId)
    oid, lk, frac, pos, norm = p.rayTest(cubePos, gemPos)[0]
    # rt = p.rayTest(cubePos, gemPos)
    # print("rayTest: %s" % rt[0][1])
    print("rayTest: Norm: ")
    print(norm)
    p.applyExternalForce(objectUniqueId=boxId, linkIndex=-1, forceObj=pos,
                         posObj=gemPos, flags=p.WORLD_FRAME)
    print(cubePos, cubeOrn)
p.disconnect()
But this just gets R2 to wiggle a bit. How do I do this?
First of all, if you are moving a robot, you should do something a little more complicated, namely provide commands to the joints of the robot. Here is an example.
Now, assuming you are moving something less complicated by applying an external force, the simplest thing you can do is multiply a factor alpha by the difference between the two positions; this becomes your force.
For your example this would be:
import numpy as np
import pybullet as p
import time
import pybullet_data
DURATION = 10000
ALPHA = 300
physicsClient = p.connect(p.GUI) # or p.DIRECT for non-graphical version
p.setAdditionalSearchPath(pybullet_data.getDataPath()) # optionally
print("data path: %s " % pybullet_data.getDataPath())
p.setGravity(0, 0, -10)
planeId = p.loadURDF("plane.urdf")
cubeStartPos = [0, 0, 1]
cubeStartOrientation = p.getQuaternionFromEuler([0, 0, 0])
boxId = p.loadURDF("r2d2.urdf", cubeStartPos, cubeStartOrientation)
gemId = p.loadURDF("duck_vhacd.urdf", [
2, 2, 1], p.getQuaternionFromEuler([0, 0, 0]))
for i in range(DURATION):
    p.stepSimulation()
    time.sleep(1./240.)
    gemPos, gemOrn = p.getBasePositionAndOrientation(gemId)
    boxPos, boxOrn = p.getBasePositionAndOrientation(boxId)
    force = ALPHA * (np.array(gemPos) - np.array(boxPos))
    p.applyExternalForce(objectUniqueId=boxId, linkIndex=-1,
                         forceObj=force, posObj=boxPos, flags=p.WORLD_FRAME)
    print('Applied force vector = {}'.format(force))
    print('Applied force magnitude = {}'.format(np.linalg.norm(force)))
p.disconnect()

Getting dimensions wrong when creating a feed-forward auto-encoder in Theano/Lasagne

I want to create a simple autoencoder with 3000 input, 2 hidden and 3000 output neurons:
def build_autoencoder(input_var=None):
    l_in = InputLayer(shape=(None, 3000), input_var=input_var)
    l_hid = DenseLayer(
        l_in, num_units=2,
        nonlinearity=rectify,
        W=lasagne.init.GlorotUniform())
    l_out = DenseLayer(
        l_hid, num_units=3000,
        nonlinearity=softmax)
    return l_out
The shape of the training data is as follows:
train.shape = (3000,3)
This is input, target and loss function definition:
import sys
import os
import time
import numpy as np
import theano
import theano.tensor as T
import lasagne
from lasagne.updates import rmsprop
from lasagne.layers import DenseLayer, DropoutLayer, InputLayer
from lasagne.nonlinearities import rectify, softmax
from lasagne.objectives import categorical_crossentropy
# Creating the Theano variables
input_var = T.dmatrix('inputs')
target_var = T.dmatrix('targets')
# Building the Theano expressions on these variables
network = build_autoencoder(input_var)
prediction = lasagne.layers.get_output(network)
loss = categorical_crossentropy(prediction, target_var)
loss = loss.mean()
test_prediction = lasagne.layers.get_output(network,
deterministic=True)
test_loss = categorical_crossentropy(test_prediction, target_var)
test_loss = test_loss.mean()
test_acc = T.mean(T.eq(T.argmax(test_prediction, axis=1), target_var),
dtype=theano.config.floatX)
I'm just running one epoch but get an error:
params = lasagne.layers.get_all_params(network, trainable=True)
updates = rmsprop(loss, params, learning_rate=0.001)
# Compiling the graph by declaring the Theano functions
train_fn = theano.function([input_var, target_var],
loss, updates=updates)
val_fn = theano.function([input_var, target_var],
[test_loss, test_acc])
# For loop that goes each time through the whole training
# and validation data
print("Starting training...")
for epoch in range(1):
    # Going over the training data
    train_err = 0
    train_batches = 0
    start_time = time.time()
    print 'test1'
    train_err += train_fn(train, train)
    train_batches += 1
    # Going over the validation data
    val_err = 0
    val_acc = 0
    val_batches = 0
    err, acc = val_fn(train, train)
    val_err += err
    val_acc += acc
    val_batches += 1
    # Then we print the results for this epoch:
    print("Epoch {} of {} took {:.3f}s".format(epoch + 1, num_epochs, time.time() - start_time))
    print("training loss:\t\t{:.6f}".format(train_err / train_batches))
    print("validation loss:\t\t{:.6f}".format(val_err / val_batches))
    print("validation accuracy:\t\t{:.2f} %".format(val_acc / val_batches * 100))
This is the error:
ValueError: ('shapes (3000,3) and (3000,2) not aligned: 3 (dim 1) != 3000 (dim 0)', (3000, 3), (3000, 2))
Apply node that caused the error: Dot22(inputs, W)
Toposort index: 3
Inputs types: [TensorType(float64, matrix), TensorType(float64, matrix)]
Inputs shapes: [(3000, 3), (3000, 2)]
Inputs strides: [(24, 8), (16, 8)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[Elemwise{add,no_inplace}(Dot22.0, InplaceDimShuffle{x,0}.0), Elemwise{Composite{(i0 * (Abs(i1) + i2 + i3))}}[(0, 2)](TensorConstant{(1, 1) of 0.5}, Elemwise{add,no_inplace}.0, Dot22.0, InplaceDimShuffle{x,0}.0)]]
To me it seems that the bottleneck of the autoencoder is the problem. Any ideas?
I just got some help from my IBM colleague (Erwan). I've posted the working solution to this GIST; the relevant sections are these:
First, get the shape of the training data correct:
train.shape = (3, 3000)
Then use the same shape on the InputLayer:
def build_autoencoder(input_var=None):
    l_in = InputLayer(shape=(3, 3000), input_var=input_var)
    l_hid = DenseLayer(
        l_in, num_units=2,
        nonlinearity=rectify,
        W=lasagne.init.GlorotUniform())
    l_out = DenseLayer(
        l_hid, num_units=3000,
        nonlinearity=softmax)
    return l_out
So this is solved; the next problem is getting a descending cost during training, but that is another topic :)
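For completeness, a tiny sketch of how the training array could be brought into that (3, 3000) layout, assuming it was originally loaded as (3000, 3) with one sample per column (this is my guess at the data layout, not something stated in the GIST):
# hypothetical: train was loaded as (3000, 3), i.e. 3000 feature values for 3 samples
train = train.T.astype('float64')   # -> (3, 3000): 3 samples of 3000 features each
assert train.shape == (3, 3000)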