Is it possible to continue training a CatBoostRanker model?
The train function has an init_model parameter, but it does not accept the CatBoostRanker type.
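For reference, a minimal sketch of the attempted continuation; the data variables (X, y, group_ids) and the training settings are hypothetical placeholders:

from catboost import CatBoostRanker, Pool, train

# hypothetical ranking data: features, relevance labels, query group ids
train_pool = Pool(X, label=y, group_id=group_ids)

base = CatBoostRanker(iterations=100, loss_function="YetiRank")
base.fit(train_pool)

# attempting to continue from the fitted ranker;
# this is where init_model rejects the CatBoostRanker type
continued = train(
    pool=train_pool,
    params={"loss_function": "YetiRank", "iterations": 100},
    init_model=base,
)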
Is it possible to access the A2C total loss and whether the environment truncated or terminated within a custom callback?
I'd like to access truncated and terminated in _on_step. That would allow me to terminate training when the environment truncates, and also to record training episode durations. I'd also like to be able to record the total loss, presumably in _on_rollout_end?
You need to attach a callback that implements the _on_step method and returns a bool based on your env's variables; returning False stops training. Something like this (I always check whether my env is a VecEnv, since it accesses its variables a bit differently compared to a non-vectorized one):
from stable_baselines3.common.callbacks import BaseCallback
from stable_baselines3.common.vec_env import VecEnv

class StopOnTruncCallback(BaseCallback):
    def __init__(self, verbose: int = 0):
        super().__init__(verbose)

    def _on_step(self) -> bool:
        # _on_step must return False to abort training,
        # so stop as soon as the env reports truncation
        return not self._is_trunc()

    def _is_trunc(self) -> bool:
        # vectorized envs expose attributes via get_attr()
        if isinstance(self.training_env, VecEnv):
            return self.training_env.get_attr("truncated")[0]
        return self.training_env.truncated
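For completeness, a usage sketch (the A2C setup here is an assumption to make it runnable):

from stable_baselines3 import A2C

model = A2C("MlpPolicy", env)  # env is your training environment
model.learn(total_timesteps=100_000, callback=StopOnTruncCallback())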
I am trying to create a BentoML service for a CatBoostClassifier model that was trained using a column as a categorical feature. If I save the model and make predictions with it directly (not as a BentoML service), everything works as expected, but when I create the service using BentoML I get an error:
_catboost.CatBoostError: Bad value for num_feature[non_default_doc_idx=0,feature_idx=2]="Tertiary": Cannot convert 'b'Tertiary'' to float
The value is found in a column named 'road_type' and the model was trained using 'object' as the data type for the column.
If I try to give a float or an integer for the 'road_type' column I get the following error
_catboost.CatBoostError: catboost/libs/data/model_dataset_compatibility.cpp:53: Feature road_type is Categorical in model but marked different in the dataset
If someone has encountered the same issue and found a solution I would appreciate it. Thanks!
I have tried different approaches to saving and loading the model, but unfortunately none of them worked.
You can try explicitly passing the cat_features to the BentoML runner.
It would look something like this:
import bentoml
from catboost import Pool

runner = bentoml.catboost.get("bentoml_catboost_model:latest").to_runner()

cat_features = [2]  # indexes of your categorical feature columns
prediction = runner.predict.run(Pool(input_data, cat_features=cat_features))
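Wrapping the input in a Pool with cat_features set tells CatBoost which columns are categorical, matching how the model was trained, so the raw string values are no longer coerced to floats.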
So, I have a model with some parameters that I need to train. There are also some parameters in the class Dataset(torch.utils.data.Dataset) that do some preprocessing, and I need to train them together with the model's parameters. Can you please let me know if what I am doing is correct:
params = list(model.parameters())
params.extend(list(Dataset.parameters()))
opt = torch.optim.Adam(params, lr=1e-4)
One more question: the Dataset class generates both train_ds and val_ds, and I only want to train the Dataset's parameters while producing train_ds; to get val_ds, I need to use the already-trained parameters of the Dataset. So how do I create train_ds and val_ds? Should I initially create both of them, train the model with train_ds, and then create them again and use val_ds for testing?
The Dataset class does not have .parameters; it is only meant to handle data.
Your model class derives from nn.Module, which does have .parameters() and can be trained.
It seems like you need to move the trainable preprocessing into your model, rather than keeping it in your dataset class.
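A minimal sketch of that restructuring; the TrainablePreprocess module and its affine parameters are illustrative assumptions, not your actual preprocessing:

import torch
import torch.nn as nn

class TrainablePreprocess(nn.Module):
    # learnable elementwise affine transform, standing in for
    # whatever preprocessing the Dataset class was doing
    def __init__(self, dim: int):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(dim))
        self.shift = nn.Parameter(torch.zeros(dim))

    def forward(self, x):
        return x * self.scale + self.shift

class ModelWithPreprocess(nn.Module):
    def __init__(self, backbone: nn.Module, dim: int):
        super().__init__()
        self.preprocess = TrainablePreprocess(dim)
        self.backbone = backbone

    def forward(self, x):
        return self.backbone(self.preprocess(x))

# model.parameters() now includes the preprocessing parameters,
# so a single optimizer covers everything:
# opt = torch.optim.Adam(model.parameters(), lr=1e-4)

This also answers the train_ds/val_ds question: the datasets stay parameter-free, and validation batches pass through the same learned preprocessing inside the model's forward pass.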
Let's say I train an autoencoder.
I want to freeze the parameters of the encoder for the training, so only the decoder trains.
I can do this using:
# assuming it's a single layer called 'encoder'
model.encoder.weight.requires_grad = False
Or I can pass only the decoder's parameters to the optimizer. Is there a difference?
The most practical way is to iterate through all parameters of the module you want to freeze and set requires_grad to False. This gives you the flexibility to switch your modules on and off without having to initialize a new optimizer each time. You can do this using the parameters() generator available on all nn.Modules:
for param in module.parameters():
    param.requires_grad = False
This method is model agnostic since you don't have to worry whether your module contains multiple layers or sub-modules.
Alternatively, you can call nn.Module.requires_grad_ once:
module.requires_grad_(False)
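In the autoencoder setting from the question, that could look like this (the encoder/decoder attribute names are assumed):

model.encoder.requires_grad_(False)  # freeze the encoder
# frozen parameters accumulate no gradients, so they are not updated
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)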
I trained a model in TensorFlow, and saved it on disk.
Now I want to load it from checkpoint and print the trained parameters.
Something like:
classifier = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=hidden_units,
    warm_start_from=checkpoint_path)

print(parameters(classifier))
How do I do that?
I'm using tf version 1.14.
I think you can use the two methods get_variable_names() and get_variable_value() to retrieve the parameters from your classifier.
params = classifier.get_variable_names()
for p in params:
    print(p, classifier.get_variable_value(p))
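If you'd rather inspect the checkpoint file directly instead of going through the estimator, here is a sketch using the TF 1.x checkpoint utilities (an alternative approach, with checkpoint_path as in the question):

import tensorflow as tf

# list every variable stored in the checkpoint and load its value
for name, shape in tf.train.list_variables(checkpoint_path):
    print(name, shape, tf.train.load_variable(checkpoint_path, name))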