LSTM with rnn: cuda()?

I have the following model:
model = nn.Sequential()
model:add(nn.Sequencer(nn.LookupTable(nIndex, hiddenSize)))
model:add(nn.Sequencer(nn.FastLSTM(hiddenSize, hiddenSize, rho)))
model:add(nn.Sequencer(nn.Linear(hiddenSize, nIndex)))
model:add(nn.Sequencer(nn.LogSoftMax()))
Then I put the model on CUDA with:
model:cuda()
Then I try to forward an input (a CudaTensor) and it breaks.
Is FastLSTM incompatible with CUDA?
The message:
[string "local f = function() return targets:cuda() en..."]:1: attempt to call method 'cuda' (a nil value)

I managed to move a few computations onto CUDA with the following changes:
- First, put the model and the criterion on CUDA:
model=model:cuda()
criterion=criterion:cuda()
- Second, I built a table of CUDA tensors that I provided as targets:
local targetscudatable = {}
for i = 1, #targets do
   table.insert(targetscudatable, targets[i]:cuda())
end
Then it works, but I wonder whether I can send more data to CUDA, such as the inputs. Anyway, I already got a speed increase of 500%, which is not too bad.

You forgot to require the cunn package:
require 'cunn'
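With that in place, something like the following should work end to end, including moving the inputs over as asked above (a rough sketch; toCuda is a hypothetical helper, and inputs/targets are assumed to be tables of tensors as in the Sequencer setup above):
require 'rnn'
require 'cunn'  -- pulls in cutorch, which provides :cuda() on tensors and modules

model = model:cuda()
criterion = criterion:cuda()

-- hypothetical helper: move every tensor in a table to the GPU
local function toCuda(tbl)
   local out = {}
   for i = 1, #tbl do
      out[i] = tbl[i]:cuda()
   end
   return out
end

local cudaInputs = toCuda(inputs)
local cudaTargets = toCuda(targets)
local outputs = model:forward(cudaInputs)
local loss = criterion:forward(outputs, cudaTargets)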

Related

What is the idiomatic way to update *part* of a memory element in FIRRTL? This comes up when updating one entry of a line in a cache.

Writing a register file in FIRRTL is straightforward: make a memory of machine words and read/write them.
However, writing a cache is different: you typically have a cache line and, on a write, only want to update part of the line, i.e. a single element of a line of the cache.
What is the idiomatic way to do this in FIRRTL? (And please do not point me at the Rocket implementation in Chisel as I find Chisel to be completely unreadable.)
I can think of at least two ways to do it:
(1) make the memory contain a Vector or Bundle and then select that member of the memory element, something like this:
cmem mem1 : {x:SInt<64>,y:SInt<64>}[4]
infer mport temp_x_mem1 = mem1[i].x, clock
temp_x_mem1 <= foo
(2) do some sort of read-modify-write, something like this:
cmem mem1 : {x:SInt<64>,y:SInt<64>}[4]
infer mport temp_x_mem1 = mem1[i], clock
bar <= temp_x_mem1
bar.x <= foo
infer mport temp_x_mem1_B = mem1[i], clock
temp_x_mem1_B <= bar
I am generating my FIRRTL from another format and I did not plan for this, so when I generate a memory, the only straightforward way is to read or write an entire memory element, not part of one. Therefore way (1) is difficult, but way (2) is straightforward. Would some layer of the FIRRTL optimizer or a subsequent Verilog optimizer make way (2) work as efficiently as way (1) if all of the code occurred in the same module?
The simplest approach would be to use a FIRRTL memory construct (mem) and avoid CHIRRTL entirely (cmem/smem). The former gives you an explicit mask port on the memory. This then enables you to describe a masked write which is exactly what you want. FIRRTL memory constructs have no Chisel API, but it sounds like you are using something else so this may not be an issue.
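For illustration, such an explicit memory with a masked write port might look roughly like this (a sketch, not compiler-verified FIRRTL; the port name w is made up, while clock, i and foo are reused from the question, and the field names follow the compiler output further down):
mem mem1 :
  data-type => { x : SInt<64>, y : SInt<64> }
  depth => 4
  read-latency => 0
  write-latency => 1
  writer => w
  read-under-write => undefined
mem1.w.clk <= clock
mem1.w.addr <= i
mem1.w.en <= UInt<1>(1)
mem1.w.mask.x <= UInt<1>(1) ; write only the x field
mem1.w.mask.y <= UInt<1>(0)
mem1.w.data.x <= foo
mem1.w.data.y is invalid ; masked off, so the value does not matter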
Approach (1) will not work as you can't use a part select to describe a memory port. (infer mport temp_x_mem1 = mem1[i].x, clock is illegal CHIRRTL.) Approach (2) will work, but you pay a cycle penalty to do it.
There is a third, idiomatic approach that involves describing the memory in such a way that a FIRRTL compiler will infer the mask. This is done by guarding the write behind a when statement which contains the enable:
circuit Foo:
  module Foo:
    input clock: Clock
    input i: UInt<2>
    input mask: {x: UInt<1>, y: UInt<1>}
    input data: {x: SInt<64>, y: SInt<64>}

    cmem mem1 : {x:SInt<64>,y:SInt<64>}[4]
    infer mport temp_x_mem1 = mem1[i], clock
    when eq(mask.x, UInt<1>(1)):
      temp_x_mem1.x <= data.x
    when eq(mask.y, UInt<1>(1)):
      temp_x_mem1.y <= data.y
Either the Scala-based FIRRTL Compiler or the MLIR-based FIRRTL Compiler will infer the when conditions as the mask.
MLIR-based FIRRTL Compiler output:
circuit Foo :
  module Foo :
    input clock : Clock
    input i : UInt<2>
    input mask : { x : UInt<1>, y : UInt<1> }
    input data : { x : SInt<64>, y : SInt<64> }

    mem mem1 :
      data-type => { x : SInt<64>, y : SInt<64> }
      depth => 4
      read-latency => 0
      write-latency => 1
      writer => temp_x_mem1
      read-under-write => undefined
    mem1.temp_x_mem1.addr is invalid
    mem1.temp_x_mem1.en <= UInt<1>(0)
    mem1.temp_x_mem1.clk is invalid
    mem1.temp_x_mem1.data is invalid
    mem1.temp_x_mem1.mask is invalid
    mem1.temp_x_mem1.addr <= i
    mem1.temp_x_mem1.en <= UInt<1>(1)
    mem1.temp_x_mem1.clk <= clock
    mem1.temp_x_mem1.mask.x <= UInt<1>(0)
    mem1.temp_x_mem1.mask.y <= UInt<1>(0)
    when eq(mask.x, UInt<1>(1)) :
      mem1.temp_x_mem1.mask.x <= UInt<1>(1)
      mem1.temp_x_mem1.data.x <= data.x
    when eq(mask.y, UInt<1>(1)) :
      mem1.temp_x_mem1.mask.y <= UInt<1>(1)
      mem1.temp_x_mem1.data.y <= data.y

Writing a dropout layer using nn.Sequential() in PyTorch

I am trying to create a Dropout Layer for my neural network using nn.Sequential() like this:
class DropoutLayer(nn.Module):
    def __init__(self, p):
        super().__init__()
        self.p = p

    def forward(self, input):
        if self.training:
            u1 = (np.random.rand(*input.shape) < self.p)
            u1 *= input
            return u1
        else:
            input *= self.p

model = nn.Sequential(Flatten(), DropoutLayer(p=0.7), nn.LogSoftmax(dim=-1))
opt = torch.optim.Adam(modelDp.parameters(), lr=0.005)
train(modelDp, opt, 5)
But I get this error:
ValueError: optimizer got an empty parameter list
First, there is what I assume to be a small typo: you declare model = nn.Sequential(...) but then use modelDp.parameters(). I assume you just made a small copy-paste mistake and these are actually the same thing.
This error is raised because no layer in your model has trainable parameters, i.e. parameters that will be updated by gradient backpropagation. As it stands, this "network" cannot learn anything at all.
To get rid of the error and get an actually working neural network, you need to include the learning layers, which, according to the previous error you reported, are linear layers. That would be something like:
model = nn.Sequential(Flatten(), nn.Linear(784, 10), DropoutLayer(0.7), nn.LogSoftmax(dim=-1))
Now a couple of additional remarks:
You may want to use PyTorch random tensors instead of NumPy's. It will be easier to deal with the device when you eventually want to move your network to the GPU.
This code is going to yield another error as soon as you try it in eval mode, because the second conditional branch in the forward method does not return anything. You may want to replace that instruction with return input * self.p (see the sketch below).
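Putting both remarks together, a minimal sketch of the layer could look like this (it keeps the question's convention of masking with rand < p and scaling by p at evaluation time, rather than the usual inverted-dropout scaling):
import torch
import torch.nn as nn

class DropoutLayer(nn.Module):
    def __init__(self, p):
        super().__init__()
        self.p = p

    def forward(self, input):
        if self.training:
            # mask drawn on the same device and with the same dtype as the input
            mask = (torch.rand_like(input) < self.p).to(input.dtype)
            return mask * input
        else:
            # scale activations at eval time, as suggested above
            return input * self.p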

How to Force Tensorflow to Run under float16?

I am building a sequential model in Keras with a custom activation function, defined as a new class written with Keras' TF backend and some of TF's own tensor operators. I put the custom activation function in ../keras/advanced_activation.py.
I intend to run it using float16 precision. Without the custom function, I could use the following to choose between float32 and float16 easily:
if self.precision == 'float16':
    K.set_floatx('float16')
    K.set_epsilon(1e-4)
else:
    K.set_floatx('float32')
    K.set_epsilon(1e-7)
Yet when I include the custom function in my model, it seems TF persists in float32 even when I choose float16. I understand that TF runs under float32 by default, so my questions are:
There are also several built-in activation functions in the same file; how does Keras make them run under float16, so that I might be able to do the same? There is a TF method tf.dtypes.cast(...); can I use it in my custom function to force TF? There is no such cast in those built-in functions.
Alternatively, how can I force TF to run under float16 directly when using Keras with TF as the backend?
Many thanks.
I got the answer by debugging. The lessons are:
First, tf.dtypes.cast(...) works.
Second, I can pass a second argument to my custom activation function to indicate the data type for the cast(...); the associated code is below.
Third, we do not need tf.constant to indicate the data type of those constants.
Fourth, I conclude that adding a custom function in custom_activation.py is the easiest way to define our own layer/activation, as long as it is differentiable everywhere, or at least piecewise differentiable with no discontinuity at the junctures.
import tensorflow as tf
from keras import backend as K
from keras.layers import Layer

# Quadruple Piece-Wise Constant Function
class MyFunc(Layer):
    def __init__(self, sharp=100, DataType='float32', **kwargs):
        super(MyFunc, self).__init__(**kwargs)
        self.supports_masking = True
        self.sharp = K.cast_to_floatx(sharp)
        self.DataType = DataType

    def call(self, inputs):
        # cast the incoming tensor to the requested precision
        inputss = tf.dtypes.cast(inputs, dtype=self.DataType)
        orig = inputss
        # some calculations
        return  # my_results

    def get_config(self):
        config = {'sharp': float(self.sharp),
                  'DataType': self.DataType}
        base_config = super(MyFunc, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

    def compute_output_shape(self, input_shape):
        return input_shape
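As an illustration, usage could then look roughly like this (the surrounding model and layer sizes are made up; only set_floatx/set_epsilon and the DataType argument come from the question and the code above):
from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense

K.set_floatx('float16')
K.set_epsilon(1e-4)

model = Sequential()
model.add(Dense(64, input_shape=(100,)))           # hypothetical layer sizes
model.add(MyFunc(sharp=100, DataType='float16'))   # cast happens inside the custom activation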
Thanks @y.selivonchyk for your worthy discussion with me, and @Yolo Swaggins for your contribution.

mxnet failed to infer type

My MXNet code, which consists of a series of complex connections and slicing, raises the following error:
Error in operator concat0: [03:03:51] src/operator/./concat-inl.h:211: Not enough information to infer type in Concat.
I'm not sure how to interpret that or what information to provide to help debug it. concat0 is part of the operation:
# Define take_column function as transpose(take(transpose(x), i))
for i in range(47):
    y_hat_lt = take_column(
        y_hat,
        mx.sym.concat(mx.sym.slice(some_indices, begin=i, end=i+1),
                      self.label_dim + mx.sym.slice(some_indices, begin=i, end=i+1),
                      dim=0))
Here some_indices is a variable which I fix to be a list. Do let me know if more information would help!
It looks like MXNet is not able to infer the shape of output. Did you specify the shape for variable some_indices?
e.g. some_indices = mx.sym.var('indices', shape=(1,1))
It would be nice if you could paste a minimal reproducible example :)
Instead of taking the transpose, swapping the axes resolved the issue.
import mxnet as mx

def ttake(x, i):
    """Take from axis 1 instead of axis 0."""
    a = mx.sym.swapaxes(x, dim1=0, dim2=1)
    return mx.sym.flatten(mx.sym.transpose(mx.sym.take(a, i)))
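Plugged back into the loop from the question, this would look roughly as follows (purely illustrative, reusing the names from the snippet above):
for i in range(47):
    idx = mx.sym.concat(
        mx.sym.slice(some_indices, begin=i, end=i+1),
        self.label_dim + mx.sym.slice(some_indices, begin=i, end=i+1),
        dim=0)
    y_hat_lt = ttake(y_hat, idx)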

Error in mcp2matrix(model, linfct = linfct)

I don't understand why it is not working for the post hoc test. What did I do wrong?
modmisto<-lme(Cobertura~Tratamento, random=~1|Parcela, data=Cover_BraquiT3)
summary(modmisto)
tukey<-glht(modmisto, mcp(Tratamento="Tukey"))
Error in mcp2matrix(model, linfct = linfct) :
Variable(s) ‘Tratamento’ of class ‘character’ is/are not contained as a factor in ‘model’.
Any help with this will be very appreciated!
Tratamento does not seem to be a factor variable; try putting this before:
Cover_BraquiT3$Tratamento = as.factor(Cover_BraquiT3$Tratamento)
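So the whole sequence would look something like this (an untested sketch assembled from the snippets above; lme comes from nlme and glht/mcp from multcomp):
library(nlme)
library(multcomp)

Cover_BraquiT3$Tratamento <- as.factor(Cover_BraquiT3$Tratamento)
modmisto <- lme(Cobertura ~ Tratamento, random = ~1 | Parcela, data = Cover_BraquiT3)
summary(modmisto)
tukey <- glht(modmisto, mcp(Tratamento = "Tukey"))
summary(tukey)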
A variable in my data.frame was of type character; the glht function did not recognize it as a factor in the model generated by the glm function.
I tried:
variable = as.factor(variable)
but I only managed to fix it using:
library(tibble)
library(dplyr)
data <- as_tibble(data) %>%
  mutate(variable = factor(variable))