Can a PyTorch DataLoader start with an empty dataset?

I have a dataset which is in a deque buffer, and I want to load random batches from this with a DataLoader. The buffer starts empty. Data will be added to the buffer before the buffer is sampled from.
self.buffer = deque([], maxlen=capacity)
self.batch_size = batch_size
self.loader = DataLoader(self.buffer, batch_size=batch_size, shuffle=True, drop_last=True)
However, this causes the following error:
File "env/lib/python3.8/site-packages/torch_geometric/loader/dataloader.py", line 78, in __init__
super().__init__(dataset, batch_size, shuffle,
File "env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 268, in __init__
sampler = RandomSampler(dataset, generator=generator)
File "env/lib/python3.8/site-packages/torch/utils/data/sampler.py", line 102, in __init__
raise ValueError("num_samples should be a positive integer "
ValueError: num_samples should be a positive integer value, but got num_samples=0
Turns out that the RandomSampler class checks that num_samples is positive when it is initialised, which causes the error.
if not isinstance(self.num_samples, int) or self.num_samples <= 0:
    raise ValueError("num_samples should be a positive integer "
                     "value, but got num_samples={}".format(self.num_samples))
Why does it check for this here, even though RandomSampler does support datasets which change in size at runtime?
One workaround is to use an IterableDataset, but I want to use the shuffle functionality of DataLoader.
Can you think of a nice way to use a DataLoader with a deque? Much appreciated!

The problem here is neither the use of a deque nor the fact that the dataset can grow dynamically. The problem is that you are starting with a dataset of size zero, which RandomSampler rejects when it is constructed.
The easiest solution would be to start with an arbitrary placeholder object in the deque and remove it afterwards.
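If the goal is simply shuffled mini-batches from a growing buffer, one alternative that avoids the sampler check altogether is to shuffle the buffer by hand. The sketch below is plain Python, not a DataLoader: ShuffledBuffer and its methods are hypothetical names, and it only mimics shuffle=True with drop_last=True (no collation, no worker processes).

```python
import random
from collections import deque


class ShuffledBuffer:
    """Hypothetical helper: a bounded buffer that yields shuffled mini-batches."""

    def __init__(self, capacity, batch_size, seed=None):
        self.buffer = deque([], maxlen=capacity)  # may start empty
        self.batch_size = batch_size
        self._rng = random.Random(seed)

    def add(self, item):
        self.buffer.append(item)

    def batches(self):
        """Yield full batches in random order; the buffer size is read at call time."""
        items = list(self.buffer)
        self._rng.shuffle(items)
        # Drop the incomplete trailing batch, as drop_last=True would.
        full = len(items) - len(items) % self.batch_size
        for start in range(0, full, self.batch_size):
            yield items[start:start + self.batch_size]
```

Because the length is only looked at inside batches(), an empty buffer just yields nothing instead of raising.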

Related

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation:

I am using an LSTM to summarize a trajectory as shown below:
class RolloutEncoder(nn.Module):
    def __init__(self, config):
        super(RolloutEncoder, self).__init__()
        self._input_size = (
            2048 + 1
        )  # deter_state + imag_reward; fix and use config["deter_dim"] + 1
        self._hidden_size = config["rollout_enc_size"]
        self._lstm = nn.LSTM(self._input_size, self._hidden_size, bias=True)

    def forward(self, traj):
        features = traj["features_pred"]
        rewards = traj["reward_pred"].unsqueeze(1)
        input = torch.cat((features, rewards), dim=2)
        encoding, (h_n, c_n) = self._lstm(input)
        code = h_n.squeeze(0)
        return code
My training loop is something like:
encoder = RolloutEncoder(config)
for e in range(episodes):
    for step in range(steps):
        print(f"Step {step}")
        # calc traj
        code = encoder(traj)
        # some operations that do not modify code but only concat it with some other tensor
        # calc loss
        opt.zero_grad()
        loss.backward()
        opt.step()
On running, I get this error:
Step 0
Step 1
Step 2
Step 3
Step 4
Step 5
Step 6
Step 7
Step 8
Step 9
Step 10
Step 11
Step 12
Step 13
Step 14
Traceback (most recent call last):
File "/path/main.py", line 351, in <module>
agent_loss.backward()
File "/home/.conda/envs/abc/lib/python3.9/site-packages/torch/tensor.py", line 245, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/user/.conda/envs/abc/lib/python3.9/site-packages/torch/autograd/__init__.py", line 145, in backward
Variable._execution_engine.run_backward(
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [16, 2049]] is at version 8; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
On setting anomaly detection to True, it points to this line in the encoder definition:
encoding, (h_n, c_n) = self._lstm(input)
This is a very common error but I am not using any inplace operation. And the error occurs after running some steps successfully which is really weird. On inspecting, I found that the [16, 2049] tensor is one of the weights of the LSTM. I also tried using dummy random tensors in place of features and rewards but the error persists, suggesting that the traj calculation has nothing to do with this error. What might be the reason for this error?

Attempting to capture an EagerTensor without building a function in tf 2.0

I want to build an asynchronous advantage actor-critic (A3C) model of an agent with multiple actions in TensorFlow 2.0. Some of the actions are continuous, while others are discrete.
For these actions, I use tfp.distributions.MultivariateNormalDiag from the tensorflow-probability package. I have spent two days struggling with this, and I still don't know how to build a network that produces the values of multiple actions.
I wrote a function that builds the distributions for multiple actions. I pass the logits tensor (the output tensor of the actor network) to the function below, and it returns the distribution for each action.
def make_dist(space, logits):
    if space.is_continuous():
        mu, logstd = tf.split(logits, 2, axis=-1)
        return tfp.distributions.MultivariateNormalDiag(mu, tf.exp(logstd))
    else:
        return tfp.distributions.Categorical(logits)
As a first step, I tested the environment with a single continuous action. When I call the sample function of this distribution, I get the following error:
File "C:\Users\SDS-1\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_probability\python\distributions\distribution.py", line 848, in sample
return self._call_sample_n(sample_shape, seed, name, **kwargs)
File "C:\Users\SDS-1\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_probability\python\distributions\transformed_distribution.py", line 373, in _call_sample_n
x = self._sample_n(n, seed, **distribution_kwargs)
File "C:\Users\SDS-1\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_probability\python\distributions\transformed_distribution.py", line 353, in _sample_n
**distribution_kwargs)
File "C:\Users\SDS-1\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_probability\python\distributions\distribution.py", line 848, in sample
return self._call_sample_n(sample_shape, seed, name, **kwargs)
File "C:\Users\SDS-1\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_probability\python\distributions\distribution.py", line 826, in _call_sample_n
samples = self._sample_n(n, seed, **kwargs)
File "C:\Users\SDS-1\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_probability\python\distributions\normal.py", line 185, in _sample_n
axis=0)
File "C:\Users\SDS-1\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\util\dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "C:\Users\SDS-1\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\ops\array_ops.py", line 1431, in concat
return gen_array_ops.concat_v2(values=values, axis=axis, name=name)
File "C:\Users\SDS-1\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\ops\gen_array_ops.py", line 1257, in concat_v2
"ConcatV2", values=values, axis=axis, name=name)
File "C:\Users\SDS-1\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 481, in _apply_op_helper
value, as_ref=input_arg.is_ref)
File "C:\Users\SDS-1\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1264, in internal_convert_to_tensor
raise RuntimeError("Attempting to capture an EagerTensor without "
RuntimeError: Attempting to capture an EagerTensor without building a function.
I used to use Keras and PyTorch, and I am new to TensorFlow 2.0. As far as I know, placeholders from tf 1.x were deprecated because of eager execution, and I suspect the problem is related to that.

Why/how can model.forward() succeed both on input being mini-batch vs single item?

Why and how does this work?
When I run the forward pass on an input that is either
a mini-batch tensor
or a single input item,
model.__call__() (which AFAIK calls forward()) accepts it and returns suitable output (i.e. a tensor holding a mini-batch of estimates, or a single estimate).
Test code adapted from the PyTorch NN example shows what I mean, but I don't understand why it works.
I would expect this to cause problems and force me to transform the single input item into a mini-batch of size 1 (reshape to (1, xxx)) or similar, as I did in the code below.
(I tried variations of the test to make sure that it does not depend on, e.g., execution order.)
# -*- coding: utf-8 -*-
import torch
# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
#N, D_in, H, D_out = 64, 1000, 100, 10
N, D_in, H, D_out = 64, 10, 4, 3
# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)
# Use the nn package to define our model as a sequence of layers. nn.Sequential
# is a Module which contains other Modules, and applies them in sequence to
# produce its output. Each Linear Module computes output from input using a
# linear function, and holds internal Tensors for its weight and bias.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)
# The nn package also contains definitions of popular loss functions; in this
# case we will use Mean Squared Error (MSE) as our loss function.
loss_fn = torch.nn.MSELoss(reduction='sum')
learning_rate = 1e-4
for t in range(1):
    # Forward pass: compute predicted y by passing x to the model. Module objects
    # override the __call__ operator so you can call them like functions. When
    # doing so you pass a Tensor of input data to the Module and it produces
    # a Tensor of output data.
    model.eval()
    print("###########")
    print("x[0]", x[0])
    print("x[0].size()", x[0].size())
    y_1pred = model(x[0])
    print("y_1pred.size()", y_1pred.size())
    print(y_1pred)
    model.eval()
    print("###########")
    print("x.size()", x.size())
    y_pred = model(x)
    print("y_pred.size()", y_pred.size())
    print("y_pred[0]", y_pred[0])
    print("###########")
    model.eval()
    input_item = x[0]
    batch_len1_shape = torch.Size([1, *(input_item.size())])
    batch_len1 = input_item.reshape(batch_len1_shape)
    y_pred_batch_len1 = model(batch_len1)
    print("input_item", input_item)
    print("input_item.size()", input_item.size())
    print("y_pred_batch_len1.size()", y_pred_batch_len1.size())
    print(y_1pred)
    raise Exception
This is the output it generates:
###########
x[0] tensor([-1.3901, -0.2659, 0.4352, -0.6890, 0.1098, -0.3124, 0.6419, 1.1004,
-0.7910, -0.5389])
x[0].size() torch.Size([10])
y_1pred.size() torch.Size([3])
tensor([-0.5366, -0.4826, 0.0538], grad_fn=<AddBackward0>)
###########
x.size() torch.Size([64, 10])
y_pred.size() torch.Size([64, 3])
y_pred[0] tensor([-0.5366, -0.4826, 0.0538], grad_fn=<SelectBackward>)
###########
input_item tensor([-1.3901, -0.2659, 0.4352, -0.6890, 0.1098, -0.3124, 0.6419, 1.1004,
-0.7910, -0.5389])
input_item.size() torch.Size([10])
y_pred_batch_len1.size() torch.Size([1, 3])
tensor([-0.5366, -0.4826, 0.0538], grad_fn=<AddBackward0>)
The docs on nn.Linear state that
Input: (N,∗,in_features) where ∗ means any number of additional dimensions
so one would naturally expect that at least two dimensions are necessary. However, if we look under the hood, we see that Linear is implemented in terms of nn.functional.linear, which dispatches to torch.addmm or torch.matmul (depending on whether a bias is used), and these broadcast their arguments.
So this behavior is likely a bug (or an error in the documentation), and I would not depend on it working in the future if I were you.
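To make the observed behavior concrete, here is a minimal pure-Python model of an affine map that acts on the last dimension and simply maps over any leading dimensions. It is only a stand-in for illustration, not PyTorch's implementation; the function name linear and the small integer weights are made up for the example.

```python
def linear(x, weight, bias):
    """Apply y = W x + b over the last dimension of a (possibly nested) list.

    x is either one vector (a list of numbers) or a nested list of vectors;
    any leading dimensions are mapped over unchanged.
    """
    if isinstance(x[0], (int, float)):
        # 1-D input: a single item, no batch dimension.
        return [sum(w * v for w, v in zip(row, x)) + b
                for row, b in zip(weight, bias)]
    # Higher-rank input: recurse over the leading dimension.
    return [linear(item, weight, bias) for item in x]


weight = [[1, 0, 2], [0, 1, -1]]  # shape (out_features=2, in_features=3)
bias = [10, 20]

single = [1, 2, 3]                # shape (3,)
batch = [[1, 2, 3], [4, 5, 6]]    # shape (2, 3)

y_single = linear(single, weight, bias)  # shape (2,)
y_batch = linear(batch, weight, bias)    # shape (2, 2)
```

The single-item result equals the first row of the batched result, which mirrors y_1pred matching y_pred[0] in the output above.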

Counting non blank and sum of length of lines in python

I am trying to create a function that takes a filename and returns a 2-tuple with the number of non-empty lines in that program and the sum of the lengths of all those lines. I made an attempt and got the following code:
def code_metric(file_name):
    with open(file_name) as f:
        lines = f.read().splitlines()
        char_count = sum(map(len,(map(str.strip,filter(None,lines)))))
        return len(lines), char_count
I am supposed to use the functionals map, filter, and reduce for this. I asked this question previously and improved on my answer, but it's still giving me an error. Here is the link to the previous version of the question:
Old program code
When I run the file cmtest.py which has the following content
import prompt,math
x = prompt.for_int('Enter x')
print(x,'!=',math.factorial(x),sep='')
the result should be
(3,85)
but I keep getting:
(4,85)
Another file to be tested is collatz.py, for example:
the result should be:
(73, 2856)
but I keep getting:
(59, 2796)
Here is a link to the collatz.py file:
Collatz.py file link
Can anyone help me correct the code? I am fairly new to Python and any help would be great.
Try this:
def code_metric(file_name):
    with open(file_name) as f:
        lines = [line.rstrip() for line in f.readlines()]
    nonblanklines = [line for line in lines if line]
    return len(nonblanklines), sum(len(line) for line in nonblanklines)
Examples:
>>> code_metric('collatz.py')
(73, 2856)
>>> code_metric('cmtest.py')
(3, 85)
Discussion
I was able to achieve the desired result for collatz.py only by removing the trailing newline and trailing blanks off the end of the lines. That is done in this step:
lines = [line.rstrip() for line in f.readlines()]
The next step is to remove the blank lines:
nonblanklines = [line for line in lines if line]
We want to return the number of non-blank lines:
len(nonblanklines)
We also want to return the total number of characters on the non-blank lines:
sum(len(line) for line in nonblanklines)
Alternate Version for Large Files
This version does not require keeping the file in memory all at once:
def code_metric2(file_name):
    with open(file_name) as f:
        # Iterate over the file object directly so lines are read one at a time.
        lengths = [len(line) for line in (line.rstrip() for line in f) if line]
    return len(lengths), sum(lengths)
Alternate Version Using reduce
Python's creator, Guido van Rossum, wrote this about the reduce builtin:
So now reduce(). This is actually the one I've always hated most,
because, apart from a few examples involving + or *, almost every time
I see a reduce() call with a non-trivial function argument, I need to
grab pen and paper to diagram what's actually being fed into that
function before I understand what the reduce() is supposed to do. So
in my mind, the applicability of reduce() is pretty much limited to
associative operators, and in all other cases it's better to write out
the accumulation loop explicitly.
Accordingly, reduce is no longer a builtin in Python 3. For compatibility, though, it remains available in the functools module. The code below shows how reduce can be used for this particular problem:
from functools import reduce
def code_metric3(file_name):
    with open(file_name) as f:
        lengths = [len(line) for line in (line.rstrip() for line in f.readlines()) if line]
    # The initial value 0 keeps reduce from failing when there are no non-blank lines.
    return len(lengths), reduce(lambda x, y: x+y, lengths, 0)
Here is yet another version which makes heavier use of reduce:
from functools import reduce
def code_metric4(file_name):
    def fn(prior, line):
        nlines, length = prior
        line = line.rstrip()
        if line:
            nlines += 1
            length += len(line)
        return nlines, length
    with open(file_name) as f:
        nlines, length = reduce(fn, f.readlines(), (0, 0))
    return nlines, length
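Since the question asks specifically for map, filter, and reduce, the same counting can also be sketched with all three together. The function name code_metric_functional is made up for this example; like the versions above, it strips only trailing whitespace before testing for blank lines.

```python
from functools import reduce


def code_metric_functional(file_name):
    with open(file_name) as f:
        # map: strip trailing whitespace (including the newline) from each line
        stripped = map(str.rstrip, f)
        # filter: keep only lines that are non-empty after stripping
        nonblank = filter(None, stripped)
        # map: replace each remaining line by its length; list() forces the
        # lazy pipeline to run while the file is still open
        lengths = list(map(len, nonblank))
    # reduce: add up the lengths, starting from 0 so an empty file also works
    return len(lengths), reduce(lambda total, n: total + n, lengths, 0)
```

The list() call matters: map and filter are lazy in Python 3, so without it the file would already be closed by the time the lengths are consumed.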

Fitting logistic regression with PyMC: ZeroProbability error

To teach myself PyMC I am trying to define a simple logistic regression. But I get a ZeroProbability error, and I do not understand exactly why this happens or how to avoid it.
Here is my code:
import pymc
import numpy as np
x = np.array([85, 95, 70, 65, 70, 90, 75, 85, 80, 85])
y = np.array([1., 1., 0., 0., 0., 1., 1., 0., 0., 1.])
w0 = pymc.Normal('w0', 0, 0.000001) # uninformative prior (any real number)
w1 = pymc.Normal('w1', 0, 0.000001) # uninformative prior (any real number)
@pymc.deterministic
def logistic(w0=w0, w1=w1, x=x):
    return 1.0 / (1. + np.exp(-(w0 + w1 * x)))
observed = pymc.Bernoulli('observed', logistic, value=y, observed=True)
And here is the trace back with the error message:
Traceback (most recent call last):
File "/Library/Python/2.7/site-packages/IPython/core/interactiveshell.py", line 2883, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-2-43ed68985dd1>", line 24, in <module>
observed = pymc.Bernoulli('observed', logistic, value=y, observed=True)
File "/usr/local/lib/python2.7/site-packages/pymc/distributions.py", line 318, in __init__
**arg_dict_out)
File "/usr/local/lib/python2.7/site-packages/pymc/PyMCObjects.py", line 772, in __init__
if not isinstance(self.logp, float):
File "/usr/local/lib/python2.7/site-packages/pymc/PyMCObjects.py", line 929, in get_logp
raise ZeroProbability(self.errmsg)
ZeroProbability: Stochastic observed's value is outside its support,
or it forbids its parents' current values.
I suspect np.exp to be causing the trouble, since it returns inf when the linear equation becomes too high.
I know there are other ways to define a logistic regression using PyMC (here is one), but I am interested in knowing why this approach does not work, and how I can define the regression using the Bernoulli object instead of using bernoulli_like.
When you create your normal stochastic with pymc.Normal('w0', 0, 0.000001), PyMC2 initializes the value with a random draw from the prior distribution. Since your prior is so diffuse (a precision of 0.000001 corresponds to a standard deviation of 1000), this can be a value which is so unlikely that the posterior is effectively zero. To fix this, just request a reasonable initial value for your Normal:
w0 = pymc.Normal('w0', 0, 0.000001, value=0)
w1 = pymc.Normal('w1', 0, 0.000001, value=0)
Here is a notebook with a few more details.
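A rough way to see the effect without PyMC is to evaluate the Bernoulli log-likelihood by hand for the data in the question. The sketch below uses only the math module; the helper names and the choice of w1 equal to one prior standard deviation are assumptions made for illustration, not values PyMC would necessarily draw.

```python
import math


def logistic(z):
    """Inverse logit, written so that math.exp never overflows."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def bernoulli_logp(y, p):
    """Log-likelihood of 0/1 observations y given success probabilities p."""
    total = 0.0
    for yi, pi in zip(y, p):
        prob = pi if yi == 1 else 1.0 - pi
        if prob == 0.0:
            return float("-inf")  # the observation is impossible under p
        total += math.log(prob)
    return total


x = [85, 95, 70, 65, 70, 90, 75, 85, 80, 85]
y = [1, 1, 0, 0, 0, 1, 1, 0, 0, 1]

# A precision of 1e-6 means a standard deviation of 1 / sqrt(1e-6), about 1000.
prior_sd = 1.0 / math.sqrt(0.000001)

# Illustrative "typical" draw: w1 one standard deviation away from zero.
p_extreme = [logistic(0.0 + prior_sd * xi) for xi in x]
logp_extreme = bernoulli_logp(y, p_extreme)

# Starting at value=0 instead gives p = 0.5 for every observation.
p_zero = [logistic(0.0 + 0.0 * xi) for xi in x]
logp_zero = bernoulli_logp(y, p_zero)
```

With the extreme weights every probability rounds to exactly 1.0, so the observed zeros have probability 0 and the log-likelihood is minus infinity, which is the kind of state the ZeroProbability message describes. Starting from zero yields a finite value of 10 * log(0.5).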
You have to put some sort of bound on the probability returned by the logistic function.
Maybe something like
@pymc.deterministic
def logistic(w0=w0, w1=w1, x=x):
    tol = 1e-9
    res = 1.0 / (1. + np.exp(-(w0 + w1 * x)))
    return np.maximum(np.minimum(res, 1 - tol), tol)
I think you forgot the negative inside the exp() function, too.
@hahdawg's answer is good, but here's something else to consider.
For your uninformative priors on w0 and w1 I would first do an eyeball fit and then use uniforms with limits.
Obviously your w1 is going to be around 1/15 = .07, so a range like .04 to 1.2 might do it.
w0 is going to be in the range of -80/15 = -5.3, so something like -7 to -3 could do it.
I'm just saying this because exp can easily go bananas, so you have to be careful what you feed it.
If your inverse logit function comes out with a value too close to 0 or 1, logistic regression is guaranteed to break.
Out of curiosity, are you using a thin argument in your call to sample? There was a bug related to that, and it may be the culprit here.
Besides, thinning is not worthwhile in any case.