CUDA out of memory error even after reducing the batch size - deep-learning

I'm trying to train a deep learning model in an Azure Notebook that uses the GPU of a DSVM (Ubuntu 18.04, Standard NC6: 6 vCPUs, 56 GiB memory) and am getting the following error:
RuntimeError: CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 11.17 GiB total capacity; 10.76 GiB already allocated; 50.31 MiB free; 10.84 GiB reserved in total by PyTorch)
I've searched around and couldn't find a solution in any of the related questions on the web. The phrase '10.84 GiB reserved in total by PyTorch' in the error message caught my attention: can this be configured to reserve less memory? I would appreciate any opinions on this. Thank you.
This is my code for fine-tuning/training:
for epoch in range(EPOCHS):
    for idx, article in tqdm_notebook(enumerate(article_loader)):
        article_tens = torch.tensor(tokenizer.encode(article[0], max_length=1024)).unsqueeze(0).to(device)
        outputs = model(article_tens, labels=article_tens)
        train_loss, prediction_scores = outputs[:2]
        train_loss.backward()
        train_sum_loss = train_sum_loss + train_loss.detach().data
        iteration_count = idx
        article_count = article_count + 1
        if article_count == BATCH_SIZE:
            article_count = 0
            batch_count += 1
            optimizer.step()
            scheduler.step()
            optimizer.zero_grad()
            model.zero_grad()
Whole stack trace of the error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-11-2c74a22e42f7> in <module>
20 article_tens = torch.tensor(tokenizer.encode(article[0], max_length=1024)).unsqueeze(0).to(device)
21
---> 22 outputs = model(article_tens, labels=article_tens)
23
24 train_loss, prediction_scores = outputs[:2]
/anaconda/envs/py37_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/anaconda/envs/py37_pytorch/lib/python3.7/site-packages/transformers/modeling_gpt2.py in forward(self, input_ids, past, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, labels, use_cache)
602 head_mask=head_mask,
603 inputs_embeds=inputs_embeds,
--> 604 use_cache=use_cache,
605 )
606 hidden_states = transformer_outputs[0]
/anaconda/envs/py37_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/anaconda/envs/py37_pytorch/lib/python3.7/site-packages/transformers/modeling_gpt2.py in forward(self, input_ids, past, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, use_cache)
486 attention_mask=attention_mask,
487 head_mask=head_mask[i],
--> 488 use_cache=use_cache,
489 )
490
/anaconda/envs/py37_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/anaconda/envs/py37_pytorch/lib/python3.7/site-packages/transformers/modeling_gpt2.py in forward(self, x, layer_past, attention_mask, head_mask, use_cache)
240
241 x = x + a
--> 242 m = self.mlp(self.ln_2(x))
243 x = x + m
244
/anaconda/envs/py37_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/anaconda/envs/py37_pytorch/lib/python3.7/site-packages/transformers/modeling_gpt2.py in forward(self, x)
215
216 def forward(self, x):
--> 217 h = self.act(self.c_fc(x))
218 h2 = self.c_proj(h)
219 return self.dropout(h2)
/anaconda/envs/py37_pytorch/lib/python3.7/site-packages/transformers/activations.py in gelu_new(x)
27 Also see https://arxiv.org/abs/1606.08415
28 """
---> 29 return 0.5 * x * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * torch.pow(x, 3.0))))
30
31
RuntimeError: CUDA out of memory. Tried to allocate 16.00 MiB (GPU 0; 11.17 GiB total capacity; 10.74 GiB already allocated; 320.00 KiB free; 10.89 GiB reserved in total by PyTorch)
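On the question about the reserved memory: '10.84 GiB reserved in total by PyTorch' is the caching allocator's pool, which grows to match what your tensors have needed, so it is not, as far as I know, a quota you can simply configure down; the practical route is reducing activation memory. A sketch of common mitigations rewritten from the loop above (MAX_LEN = 512 is a hypothetical choice, not a value from the question; whether it is acceptable depends on your articles):

import torch

MAX_LEN = 512  # hypothetical: halving the 1024-token length roughly halves activation memory

for epoch in range(EPOCHS):
    for idx, article in enumerate(article_loader):
        # truncation=True is the newer-transformers spelling; older versions
        # already truncate when max_length is given
        ids = tokenizer.encode(article[0], max_length=MAX_LEN, truncation=True)
        article_tens = torch.tensor(ids).unsqueeze(0).to(device)
        outputs = model(article_tens, labels=article_tens)
        train_loss = outputs[0]
        train_loss.backward()
        train_sum_loss += train_loss.detach().item()  # keep a Python float, not a GPU tensor
        del article_tens, outputs, train_loss         # drop GPU references before the next sample
        article_count += 1
        if article_count == BATCH_SIZE:
            article_count = 0
            optimizer.step()
            scheduler.step()
            optimizer.zero_grad()

If that is still not enough, a smaller GPT-2 checkpoint or half-precision training are the next levers.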

Related

Not able to read CSV, error: list index out of range

It seems like my Excel file, saved in CSV format, is not being read properly as a CSV file in JupyterLab. I have the same code as all of my classmates, but they do not get any error. Could there be something wrong with my Excel settings, or is the problem mainly in my code?
This is my code:
Use_raw = pd.read_csv("U15_US_2021.csv", header=3, index_col=1, na_values='---').drop('Unnamed: 0', axis=1)
Use = Use_raw.iloc[0:17, 0:15].fillna(0)
ValueAdded = Use_raw.iloc[18:22, 0:15].fillna(0)
FinalDemand = Use_raw.iloc[0:17, 16:21].fillna(0)
Supply_raw = pd.read_csv('S15_US_2021.csv', header=3, index_col=1, na_values='---')
Supply = Supply_raw.iloc[0:17, 1:16].fillna(0)
The error:
IndexError Traceback (most recent call last)
Cell In[23], line 1
----> 1 Use_raw = pd.read_csv("U15_US_2021.csv", header=3, index_col=1, na_values='--- ').drop('Unnamed: 0', axis=1)
2 Use=Use_raw.iloc[0:17,0:15].fillna(0)
3 ValueAdded=Use_raw.iloc[18:22,0:15].fillna(0)
File ~\anaconda3\envs\IO\lib\site-packages\pandas\util\_decorators.py:211, in deprecate_kwarg. <locals>._deprecate_kwarg.<locals>.wrapper(*args, **kwargs)
209 else:
210 kwargs[new_arg_name] = new_arg_value
--> 211 return func(*args, **kwargs)
File ~\anaconda3\envs\IO\lib\site-packages\pandas\util\_decorators.py:331, in deprecate_nonkeyword_arguments.<locals>.decorate.<locals>.wrapper(*args, **kwargs)
325 if len(args) > num_allow_args:
326 warnings.warn(
327 msg.format(arguments=_format_argument_list(allow_args)),
328 FutureWarning,
329 stacklevel=find_stack_level(),
330 )
--> 331 return func(*args, **kwargs)
File ~\anaconda3\envs\IO\lib\site-packages\pandas\io\parsers\readers.py:950, in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, error_bad_lines, warn_bad_lines, on_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options)
935 kwds_defaults = _refine_defaults_read(
936 dialect,
937 delimiter,
(...)
946 defaults={"delimiter": ","},
947 )
948 kwds.update(kwds_defaults)
--> 950 return _read(filepath_or_buffer, kwds)
File ~\anaconda3\envs\IO\lib\site-packages\pandas\io\parsers\readers.py:605, in _read(filepath_or_buffer, kwds)
602 _validate_names(kwds.get("names", None))
604 # Create the parser.
--> 605 parser = TextFileReader(filepath_or_buffer, **kwds)
607 if chunksize or iterator:
608 return parser
File ~\anaconda3\envs\IO\lib\site-packages\pandas\io\parsers\readers.py:1442, in TextFileReader.__init__(self, f, engine, **kwds)
1439 self.options["has_index_names"] = kwds["has_index_names"]
1441 self.handles: IOHandles | None = None
-> 1442 self._engine = self._make_engine(f, self.engine)
File ~\anaconda3\envs\IO\lib\site-packages\pandas\io\parsers\readers.py:1753, in TextFileReader._make_engine(self, f, engine)
1750 raise ValueError(msg)
1752 try:
-> 1753 return mapping[engine](f, **self.options)
1754 except Exception:
1755 if self.handles is not None:
File ~\anaconda3\envs\IO\lib\site-packages\pandas\io\parsers\c_parser_wrapper.py:174, in CParserWrapper.__init__(self, src, **kwds)
164 if self._reader.leading_cols == 0 and is_index_col(
165 self.index_col # type: ignore[has-type]
166 ):
168 self._name_processed = True
169 (
170 index_names,
171 # error: Cannot determine type of 'names'
172 self.names, # type: ignore[has-type]
173 self.index_col,
--> 174 ) = self._clean_index_names(
175 # error: Cannot determine type of 'names'
176 self.names, # type: ignore[has-type]
177 # error: Cannot determine type of 'index_col'
178 self.index_col, # type: ignore[has-type]
179 )
181 if self.index_names is None:
182 self.index_names = index_names
File ~\anaconda3\envs\IO\lib\site-packages\pandas\io\parsers\base_parser.py:999, in ParserBase._clean_index_names(self, columns, index_col)
997 break
998 else:
--> 999 name = cp_cols[c]
1000 columns.remove(name)
1001 index_names.append(name)
IndexError: list index out of range
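The traceback gives a hint about where to look: the failure is in _clean_index_names, where pandas looks up index_col=1 among the parsed column names, so the row used as the header (header=3, i.e. the fourth line of the file) is apparently parsing to fewer columns than expected. A diagnostic sketch, assuming only the file name from the question:

# print the first few raw lines to check the delimiter and the header row's width
with open("U15_US_2021.csv", encoding="utf-8") as f:
    for i, line in enumerate(f):
        if i > 4:
            break
        print(i, repr(line))

If the lines turn out to be semicolon-separated (a common effect of regional Excel settings when saving as CSV), the whole header collapses into a single column and index_col=1 is out of range; passing sep=';' to read_csv would be the next thing to try.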

ValueError while generating a sample with the CTGAN library

I am using the CTGAN library in a Colab notebook. I passed in a tabular dataset with one categorical feature, and I specified the categorical feature as described in the documentation. Model training completes without error, but I get a ValueError while generating simulated data.
How can I resolve this error?
A reproducible example is below:
import pandas as pd
import numpy as np
import seaborn as sns
from ctgan import CTGAN
from sklearn import preprocessing

iris = sns.load_dataset('iris')
iris.head()

le = preprocessing.LabelEncoder()
le.fit(iris['species'].unique())
iris['species'] = pd.DataFrame(le.transform(iris['species']))

data = iris.copy()
ctgan_model = CTGAN(epochs=2, batch_size=50, verbose=True)
ctgan_model.fit(data)

n_ctgan_generated_data = 2000
synthetic_data = ctgan_model.sample(n_ctgan_generated_data)
Complete error message:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-17-199b6dc04389> in <module>
1 n_ctgan_generated_data = 2000
----> 2 synthetic_data = ctgan.sample(n_ctgan_generated_data)
6 frames
/usr/local/lib/python3.8/dist-packages/ctgan/synthesizers/base.py in wrapper(self, *args, **kwargs)
48 def wrapper(self, *args, **kwargs):
49 if self.random_states is None:
---> 50 return function(self, *args, **kwargs)
51
52 else:
/usr/local/lib/python3.8/dist-packages/ctgan/synthesizers/ctgan.py in sample(self, n, condition_column, condition_value)
475 data = data[:n]
476
--> 477 return self._transformer.inverse_transform(data)
478
479 def set_device(self, device):
/usr/local/lib/python3.8/dist-packages/ctgan/data_transformer.py in inverse_transform(self, data, sigmas)
211 column_data = data[:, st:st + dim]
212 if column_transform_info.column_type == 'continuous':
--> 213 recovered_column_data = self._inverse_transform_continuous(
214 column_transform_info, column_data, sigmas, st)
215 else:
/usr/local/lib/python3.8/dist-packages/ctgan/data_transformer.py in _inverse_transform_continuous(self, column_transform_info, column_data, sigmas, st)
185 def _inverse_transform_continuous(self, column_transform_info, column_data, sigmas, st):
186 gm = column_transform_info.transform
--> 187 data = pd.DataFrame(column_data[:, :2], columns=list(gm.get_output_sdtypes()))
188 data.iloc[:, 1] = np.argmax(column_data[:, 1:], axis=1)
189 if sigmas is not None:
/usr/local/lib/python3.8/dist-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
670 )
671 else:
--> 672 mgr = ndarray_to_mgr(
673 data,
674 index,
/usr/local/lib/python3.8/dist-packages/pandas/core/internals/construction.py in ndarray_to_mgr(values, index, columns, dtype, copy, typ)
322 )
323
--> 324 _check_values_indices_shape_match(values, index, columns)
325
326 if typ == "array":
/usr/local/lib/python3.8/dist-packages/pandas/core/internals/construction.py in _check_values_indices_shape_match(values, index, columns)
391 passed = values.shape
392 implied = (len(index), len(columns)-1)
--> 393 raise ValueError(f"Shape of passed values is {passed}, indices imply {implied}")
394
395
ValueError: Shape of passed values is (2000, 2), indices imply (2000, 3)
Is this an issue with the library itself, where I would have to change the source code?
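Two things are commonly tried for this error; a hedged sketch of both, with neither being a confirmed fix:

import seaborn as sns
from ctgan import CTGAN

iris = sns.load_dataset('iris')

# 1) keep the raw categorical column and declare it via discrete_columns
#    (the documented CTGAN interface) instead of label-encoding it beforehand
model = CTGAN(epochs=2, batch_size=50, verbose=True)
model.fit(iris, discrete_columns=['species'])
synthetic = model.sample(2000)

# 2) the traceback fails on gm.get_output_sdtypes() inside ctgan's data_transformer;
#    that method comes from the rdt dependency, so a ctgan/rdt version mismatch is a
#    plausible cause, and upgrading the two together is worth trying:
#    pip install --upgrade ctgan rdt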

I have moved input data to the GPU but still cannot train the model

Error: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor
I checked where the model and the input data live and got device='cuda:0' for both, but when I print get_device() I get -1; I do not know if this is wrong.
I also tried both .to(device) and .cuda(); neither works.
Here is the link to this code: https://www.kaggle.com/code/dongjj/dataloader/edit/run/106753596
Thanks in advance!!!
from datetime import datetime

import torch
import torch.nn.functional as F
from torch.utils.tensorboard import SummaryWriter

model = base_model
model.cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=5e-5)
loss_fn = F.cross_entropy

def train_one_epoch(epoch_index, tb_writer):
    running_loss = 0.
    last_loss = 0.
    for i, data in enumerate(train_loader):
        # Every data instance is an input + label pair
        inputs, labels = data
        inputs.cuda()
        print(inputs.get_device())
        labels.cuda()
        # Zero your gradients for every batch!
        optimizer.zero_grad()
        # Make predictions for this batch
        outputs = model(inputs)
        # Compute the loss and its gradients
        loss = loss_fn(outputs, labels)
        loss.backward()
        # Adjust learning weights
        optimizer.step()
        # Gather data and report
        running_loss += loss.item()
        if i % 1000 == 999:
            last_loss = running_loss / 1000  # loss per batch
            print('  batch {} loss: {}'.format(i + 1, last_loss))
            tb_x = epoch_index * len(train_loader) + i + 1
            tb_writer.add_scalar('Loss/train', last_loss, tb_x)
            running_loss = 0.
    return last_loss

timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
writer = SummaryWriter('runs/fashion_trainer_{}'.format(timestamp))
epoch_number = 0
EPOCHS = 5
best_vloss = 1_000_000.

for epoch in range(EPOCHS):
    print('EPOCH {}:'.format(epoch_number + 1))
    model.train(True)
    avg_loss = train_one_epoch(epoch_number, writer)
    # We don't need gradients on to do reporting
    model.train(False)
    running_vloss = 0.0
    for i, vdata in enumerate(test_loader):
        vinputs, vlabels = vdata
        voutputs = model(vinputs)
        vloss = loss_fn(voutputs, vlabels)
        running_vloss += vloss
    avg_vloss = running_vloss / (i + 1)
    print('LOSS train {} valid {}'.format(avg_loss, avg_vloss))
    # Log the running loss averaged per batch
    # for both training and validation
    writer.add_scalars('Training vs. Validation Loss',
                       {'Training': avg_loss, 'Validation': avg_vloss},
                       epoch_number + 1)
    writer.flush()
    # Track best performance, and save the model's state
    if avg_vloss < best_vloss:
        best_vloss = avg_vloss
        model_path = 'model_{}_{}'.format(timestamp, epoch_number)
        torch.save(model.state_dict(), model_path)
    epoch_number += 1
EPOCH 1:
-1
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
/tmp/ipykernel_17/1115936933.py in <module>
55 print('EPOCH {}:'.format(epoch_number + 1))
56 model.train(True)
---> 57 avg_loss = train_one_epoch(epoch_number, writer)
58
59 # We don't need gradients on to do reporting
/tmp/ipykernel_17/1115936933.py in train_one_epoch(epoch_index, tb_writer)
25
26 # Make predictions for this batch
---> 27 outputs = model(inputs)
28
29 # Compute the loss and its gradients
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1108 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1109 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110 return forward_call(*input, **kwargs)
1111 # Do not call functions when jit is used
1112 full_backward_hooks, non_full_backward_hooks = [], []
/opt/conda/lib/python3.7/site-packages/torchvision/models/resnet.py in forward(self, x)
281
282 def forward(self, x: Tensor) -> Tensor:
--> 283 return self._forward_impl(x)
284
285
/opt/conda/lib/python3.7/site-packages/torchvision/models/resnet.py in _forward_impl(self, x)
264 def _forward_impl(self, x: Tensor) -> Tensor:
265 # See note [TorchScript super()]
--> 266 x = self.conv1(x)
267 x = self.bn1(x)
268 x = self.relu(x)
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1108 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1109 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110 return forward_call(*input, **kwargs)
1111 # Do not call functions when jit is used
1112 full_backward_hooks, non_full_backward_hooks = [], []
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py in forward(self, input)
445
446 def forward(self, input: Tensor) -> Tensor:
--> 447 return self._conv_forward(input, self.weight, self.bias)
448
449 class Conv3d(_ConvNd):
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight, bias)
442 _pair(0), self.dilation, self.groups)
443 return F.conv2d(input, weight, bias, self.stride,
--> 444 self.padding, self.dilation, self.groups)
445
446 def forward(self, input: Tensor) -> Tensor:
RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor
You assume that .cuda() is an in-place operation, when it actually returns a copy of the data on the GPU. In other words, you need to reassign your inputs and labels:
inputs = inputs.cuda()
labels = labels.cuda()
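For completeness, a minimal sketch of the corrected inner loop (variable names taken from the question; .to(device) behaves the same way, and the validation loop needs the same reassignment for vinputs and vlabels):

for i, data in enumerate(train_loader):
    inputs, labels = data
    inputs = inputs.cuda()   # .cuda() returns a copy on the GPU, so reassign it
    labels = labels.cuda()   # after this, inputs.get_device() prints 0, not -1
    optimizer.zero_grad()
    outputs = model(inputs)  # model weights and inputs are now both on cuda:0
    loss = loss_fn(outputs, labels)
    loss.backward()
    optimizer.step()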

JSON decode issue backtesting with zipline

I'm running the test code below on an Ubuntu server. I'm trying to backtest a simple buy-and-hold strategy with zipline, mainly to make sure I've got everything installed correctly. I'm pretty new to zipline, and I'm getting the error below and am not sure why.
I'm following the steps in this blog post:
https://towardsdatascience.com/introduction-to-backtesting-trading-strategies-7afae611a35e
I'm running the code in a Jupyter notebook.
When I run the command below first, to verify the bundle has been ingested, this is what it shows:
code:
!zipline bundles
output:
apple-prices-2017-2019 2020-06-06 20:33:05.356288
apple-prices-2017-2019 2020-06-05 06:33:39.834841
apple-prices-2017-2019 2020-06-05 06:29:53.091904
apple-prices-2017-2019 2020-06-05 06:26:56.583051
csvdir <no ingestions>
quandl 2020-06-06 20:25:58.737940
quandl 2020-06-06 20:18:01.977412
quantopian-quandl 2020-06-04 16:15:57.717373
quantopian-quandl 2020-05-29 05:59:54.967114
The code I'm running in the Jupyter notebook is below, along with the error. Does anyone see what the issue is, and can you suggest how to fix it?
code:
%%zipline --start 2016-1-1 --end 2017-12-31 --capital-base 1050.0 -o buy_and_hold.pkl

# imports
from zipline.api import order, symbol, record

# parameters
selected_stock = 'AAPL'
n_stocks_to_buy = 10

def initialize(context):
    context.has_ordered = False

def handle_data(context, data):
    # record price for further inspection
    record(price=data.current(symbol(selected_stock), 'price'))
    # trading logic
    if not context.has_ordered:
        # placing order, negative number for sale/short
        order(symbol(selected_stock), n_stocks_to_buy)
        # setting up a flag for holding a position
        context.has_ordered = True
error:
---------------------------------------------------------------------------
JSONDecodeError Traceback (most recent call last)
<ipython-input-10-d11030c42e8b> in <module>()
----> 1 get_ipython().run_cell_magic('zipline', '--start 2016-1-1 --end 2017-12-31 --capital-base 1050.0 -o buy_and_hold.pkl', "\n# imports\nfrom zipline.api import order, symbol, record\n\n# parameters\nselected_stock = 'AAPL'\nn_stocks_to_buy = 10\n\ndef initialize(context):\n context.has_ordered = False \n\ndef handle_data(context, data):\n # record price for further inspection\n record(price=data.current(symbol(selected_stock), 'price'))\n \n # trading logic\n if not context.has_ordered:\n # placing order, negative number for sale/short\n order(symbol(selected_stock), n_stocks_to_buy)\n # setting up a flag for holding a position\n context.has_ordered = True")
~/anaconda3/envs/py36/lib/python3.5/site-packages/IPython/core/interactiveshell.py in run_cell_magic(self, magic_name, line, cell)
2165 magic_arg_s = self.var_expand(line, stack_depth)
2166 with self.builtin_trap:
-> 2167 result = fn(magic_arg_s, cell)
2168 return result
2169
~/anaconda3/envs/py36/lib/python3.5/site-packages/zipline/__main__.py in zipline_magic(line, cell)
309 '%s%%zipline' % ((cell or '') and '%'),
310 # don't use system exit and propogate errors to the caller
--> 311 standalone_mode=False,
312 )
313 except SystemExit as e:
~/anaconda3/envs/py36/lib/python3.5/site-packages/click/core.py in main(self, args, prog_name, complete_var, standalone_mode, **extra)
780 try:
781 with self.make_context(prog_name, args, **extra) as ctx:
--> 782 rv = self.invoke(ctx)
783 if not standalone_mode:
784 return rv
~/anaconda3/envs/py36/lib/python3.5/site-packages/click/core.py in invoke(self, ctx)
1064 _maybe_show_deprecated_notice(self)
1065 if self.callback is not None:
-> 1066 return ctx.invoke(self.callback, **ctx.params)
1067
1068
~/anaconda3/envs/py36/lib/python3.5/site-packages/click/core.py in invoke(*args, **kwargs)
608 with augment_usage_errors(self):
609 with ctx:
--> 610 return callback(*args, **kwargs)
611
612 def forward(*args, **kwargs): # noqa: B902
~/anaconda3/envs/py36/lib/python3.5/site-packages/click/decorators.py in new_func(*args, **kwargs)
19
20 def new_func(*args, **kwargs):
---> 21 return f(get_current_context(), *args, **kwargs)
22
23 return update_wrapper(new_func, f)
~/anaconda3/envs/py36/lib/python3.5/site-packages/zipline/__main__.py in run(ctx, algofile, algotext, define, data_frequency, capital_base, bundle, bundle_timestamp, start, end, output, trading_calendar, print_algo, metrics_set, local_namespace, blotter)
274 local_namespace=local_namespace,
275 environ=os.environ,
--> 276 blotter=blotter,
277 )
278
~/anaconda3/envs/py36/lib/python3.5/site-packages/zipline/utils/run_algo.py in _run(handle_data, initialize, before_trading_start, analyze, algofile, algotext, defines, data_frequency, capital_base, data, bundle, bundle_timestamp, start, end, output, trading_calendar, print_algo, metrics_set, local_namespace, environ, blotter)
157 trading_calendar=trading_calendar,
158 trading_day=trading_calendar.day,
--> 159 trading_days=trading_calendar.schedule[start:end].index,
160 )
161 first_trading_day =\
~/anaconda3/envs/py36/lib/python3.5/site-packages/zipline/finance/trading.py in __init__(self, load, bm_symbol, exchange_tz, trading_calendar, trading_day, trading_days, asset_db_path, future_chain_predicates, environ)
101 trading_day,
102 trading_days,
--> 103 self.bm_symbol,
104 )
105
~/anaconda3/envs/py36/lib/python3.5/site-packages/zipline/data/loader.py in load_market_data(trading_day, trading_days, bm_symbol, environ)
147 # date so that we can compute returns for the first date.
148 trading_day,
--> 149 environ,
150 )
151 tc = ensure_treasury_data(
~/anaconda3/envs/py36/lib/python3.5/site-packages/zipline/data/loader.py in ensure_benchmark_data(symbol, first_date, last_date, now, trading_day, environ)
214
215 try:
--> 216 data = get_benchmark_returns(symbol)
217 data.to_csv(get_data_filepath(filename, environ))
218 except (OSError, IOError, HTTPError):
~/anaconda3/envs/py36/lib/python3.5/site-packages/zipline/data/benchmarks.py in get_benchmark_returns(symbol)
33 'https://api.iextrading.com/1.0/stock/{}/chart/5y'.format(symbol)
34 )
---> 35 data = r.json()
36
37 df = pd.DataFrame(data)
~/anaconda3/envs/py36/lib/python3.5/site-packages/requests/models.py in json(self, **kwargs)
894 # used.
895 pass
--> 896 return complexjson.loads(self.text, **kwargs)
897
898 #property
~/anaconda3/envs/py36/lib/python3.5/json/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
317 parse_int is None and parse_float is None and
318 parse_constant is None and object_pairs_hook is None and not kw):
--> 319 return _default_decoder.decode(s)
320 if cls is None:
321 cls = JSONDecoder
~/anaconda3/envs/py36/lib/python3.5/json/decoder.py in decode(self, s, _w)
337
338 """
--> 339 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
340 end = _w(s, end).end()
341 if end != len(s):
~/anaconda3/envs/py36/lib/python3.5/json/decoder.py in raw_decode(self, s, idx)
355 obj, end = self.scan_once(s, idx)
356 except StopIteration as err:
--> 357 raise JSONDecodeError("Expecting value", s, err.value) from None
358 return obj, end
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
This post solved my problem; I only needed the first two steps listed in it. You need to update the benchmarks.py and loader.py files in zipline/data:
Getting JSONDecodeError: Expecting value: line 1 column 1 (char 0) with Python + Zipline + Docker + Jupyter
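For reference, a sketch of the kind of patch those steps apply (an assumption modeled on community workarounds, not official zipline code): the root cause is that get_benchmark_returns in zipline/data/benchmarks.py downloads benchmark data from the retired api.iextrading.com endpoint, which now returns a non-JSON response, hence the JSONDecodeError. Replacing it with a local stub avoids the network call entirely:

# hypothetical replacement body for get_benchmark_returns in zipline/data/benchmarks.py;
# a flat 0% daily-return series is enough to get a local backtest running, though
# benchmark-relative metrics (alpha, beta) become meaningless
import pandas as pd

def get_benchmark_returns(symbol):
    dates = pd.date_range('1990-01-03', pd.Timestamp.today(), freq='B', tz='utc')
    return pd.Series(0.0, index=dates)

The loader.py change in the linked post similarly bypasses the treasury-data download.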

RuntimeError: cuda runtime error (710) : device-side assert triggered at

Training an image classifier with PyTorch, I got the following error message:
RuntimeError                              Traceback (most recent call last)
in <module>
     29     print(len(train_loader.dataset), len(valid_loader.dataset))
     30     #break
---> 31     train_loss, train_acc, model = train(model, device, train_loader, optimizer, criterion)
     32     valid_loss, valid_acc, model = evaluate(model, device, valid_loader, criterion)
     33
in train(model, device, iterator, optimizer, criterion)
     21     acc = calculate_accuracy(fx, y)
     22     #print("5.")
---> 23     loss.backward()
     24
     25     optimizer.step()
~/venv/lib/python3.7/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
    164                 products. Defaults to False.
    165         """
--> 166         torch.autograd.backward(self, gradient, retain_graph, create_graph)
    167
    168     def register_hook(self, hook):
~/venv/lib/python3.7/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
     97     Variable._execution_engine.run_backward(
     98         tensors, grad_tensors, retain_graph, create_graph,
---> 99         allow_unreachable=True)  # allow_unreachable flag
    100
    101
RuntimeError: cuda runtime error (710) : device-side assert triggered at /pytorch/aten/src/THC/generic/THCTensorMath.cu:26
The related code block is here:
def train(model, device, iterator, optimizer, criterion):
    print('train')
    epoch_loss = 0
    epoch_acc = 0
    model.train()
    for (x, y) in iterator:
        #print(x,y)
        x, y = x.cuda(), y.cuda()
        #x = x.to(device)
        #y = y.to(device)
        #print('1')
        optimizer.zero_grad()
        #print('2')
        fx = model(x)
        #print('3')
        loss = criterion(fx, y)
        #print("4.loss->",loss)
        acc = calculate_accuracy(fx, y)
        #print("5.")
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
        epoch_acc += acc.item()
    return epoch_loss / len(iterator), epoch_acc / len(iterator), model
import os
from torch.utils.data import DataLoader

EPOCHS = 5
SAVE_DIR = 'models'
MODEL_SAVE_PATH = os.path.join(SAVE_DIR, 'please.pt')

best_valid_loss = float('inf')
if not os.path.isdir(f'{SAVE_DIR}'):
    os.makedirs(f'{SAVE_DIR}')
print("start")
for epoch in range(EPOCHS):
    print('================================', epoch, '================================')
    for i, (train_idx, valid_idx) in enumerate(zip(train_indexes, valid_indexes)):
        print(i, train_idx, valid_idx, len(train_idx), len(valid_idx))
        traindf = df_train.iloc[train_idx, :].reset_index()
        validdf = df_train.iloc[valid_idx, :].reset_index()
        #traindf = df_train
        #validdf = df_train
        train_dataset = TrainDataset(traindf, mode='train', transforms=data_transforms)
        valid_dataset = TrainDataset(validdf, mode='valid', transforms=data_transforms)
        train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
        valid_loader = DataLoader(valid_dataset, batch_size=batch_size, shuffle=False)
        print(len(train_loader.dataset), len(valid_loader.dataset))
        #break
        train_loss, train_acc, model = train(model, device, train_loader, optimizer, criterion)
        valid_loss, valid_acc, model = evaluate(model, device, valid_loader, criterion)
        if valid_loss < best_valid_loss:
            best_valid_loss = valid_loss
            torch.save(model, MODEL_SAVE_PATH)
        print(f'| Epoch: {epoch+1:02} | Train Loss: {train_loss:.3f} | Train Acc: {train_acc*100:05.2f}% | Val. Loss: {valid_loss:.3f} | Val. Acc: {valid_acc*100:05.2f}% |')
splits = zip(train_indexes, valid_indexes)
[ 3692  3696  3703 ... 30733 30734 30735] [    0     1     2 ...  4028  4041  4046]
[    0     1     2 ... 30733 30734 30735] [ 3692  3696  3703 ...  7986  7991  8005]
[    0     1     2 ... 30733 30734 30735] [ 7499  7500  7502 ... 11856 11858 11860]
[    0     1     2 ... 30733 30734 30735] [11239 11274 11280 ... 15711 15716 15720]
[    0     1     2 ... 30733 30734 30735] [15045 15051 15053 ... 19448 19460 19474]
[    0     1     2 ... 30733 30734 30735] [18919 18920 18926 ... 23392 23400 23402]
[    0     1     2 ... 30733 30734 30735] [22831 22835 22846 ... 27118 27120 27124]
[    0     1     2 ... 27118 27120 27124] [26718 26721 26728 ... 30733 30734 30735]
What was your loss function?
I got this error too. My problem was a multi-class classification and I was using a cross-entropy loss.
As it says in the documentation, labels should be in the range [0, C-1], where C is the number of classes.
But my labels were not in that range, and when I used proper values for the labels, everything was OK.
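A minimal sketch of that label fix (assuming the raw labels are arbitrary integers such as 1..C or sparse ids, which is an assumption about this dataset): remap them to the contiguous range [0, C-1] that nn.CrossEntropyLoss expects. Running the script once with CUDA_LAUNCH_BLOCKING=1 (or on CPU) also makes the failing operation appear at the right line in the traceback instead of inside loss.backward().

import torch

raw_labels = torch.tensor([3, 7, 3, 9, 7])   # hypothetical out-of-range labels
classes = raw_labels.unique(sorted=True)     # tensor([3, 7, 9]) -> C = 3 classes
remap = {c.item(): i for i, c in enumerate(classes)}
y = torch.tensor([remap[int(l)] for l in raw_labels])
print(y)  # tensor([0, 1, 0, 2, 1]) -- valid targets for nn.CrossEntropyLoss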