Hugging Face TFRobertaForSequenceClassification.from_pretrained("cardiffnlp/twitter-roberta-base-emotion", num_labels=1) gives ValueError - regression

Reproducible on Google Colab (transformers 4.24.0):
from transformers import TFRobertaForSequenceClassification
model = TFRobertaForSequenceClassification.from_pretrained("cardiffnlp/twitter-roberta-base-emotion", num_labels=1)
I want to set num_labels=1 because I want to use it as a regression model, but the code above raises a ValueError:
ValueError: cannot reshape array of size 3072 into shape (768,1)
I remember this working for the DistilBERT model. Is this the right way to call it, or is it not supported by HF for anything other than the BERT family?

Related

Detectron2 models not generating any results

I am just trying out detectron2 with some basic code, as follows:
import torch
from PIL import Image
from torchvision import transforms
from detectron2 import model_zoo
model = model_zoo.get('COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml', trained=True)
im = Image.open('input.jpg')
t = transforms.ToTensor()
model.eval()
with torch.no_grad():
    im = t(im)
    output = model([{'image': im}])
print(output)
However, the model does not produce any meaningful predictions:
[{'instances': Instances(num_instances=0, image_height=480, image_width=640, fields=[pred_boxes: Boxes(tensor([], device='cuda:0', size=(0, 4))), scores: tensor([], device='cuda:0'), pred_classes: tensor([], device='cuda:0', dtype=torch.int64)])}]
I don't quite get what went wrong. The detectron2 documentation states:
You can also run inference directly like this:
model.eval()
with torch.no_grad():
    outputs = model(inputs)
and
For inference of builtin models, only “image” key is required, and “width/height” are optional.
Given that, I can't find what I'm missing here.
I had the same issue, and there were two things I had to fix. The first was resizing the shortest edge: I used ResizeShortestEdge, imported from detectron2.data.transforms. The sizes the model expects can be found in cfg.INPUT, which lists the min/max sizes for test and train. The second was matching the image's color channel order to the one specified in cfg.
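To make the resizing step concrete, here is a minimal sketch of how a shortest-edge resize picks its output size. It is plain Python written for illustration, not detectron2's own code: the function name shortest_edge_shape is made up, and 800 and 1333 are assumed stand-ins for the min/max test sizes you would read from cfg.INPUT.

```python
def shortest_edge_shape(height, width, short_edge, max_size):
    """Return (new_height, new_width) after scaling the shorter side to
    `short_edge`, shrinking further if the longer side would exceed `max_size`.
    The aspect ratio is preserved up to rounding."""
    scale = short_edge / min(height, width)
    new_h, new_w = height * scale, width * scale
    longest = max(new_h, new_w)
    if longest > max_size:
        # Cap the longer side, shrinking both sides by the same factor.
        shrink = max_size / longest
        new_h, new_w = new_h * shrink, new_w * shrink
    # Round to the nearest whole pixel.
    return int(new_h + 0.5), int(new_w + 0.5)


# The 480x640 image from the question, with the assumed sizes 800 / 1333.
print(shortest_edge_shape(480, 640, short_edge=800, max_size=1333))  # (800, 1067)
```

The channel fix is a separate step: the image array's channels have to be in the order the config names (cfg.INPUT.FORMAT) before the image is passed to the model.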

AttributeError: 'collections.OrderedDict' object has no attribute 'predict'

As a beginner in deep learning and PyTorch, I am not sure what information you need to answer my question, but I will try my best to explain the problem. I loaded a model in PyTorch using 'model = torch.load('model/resnet18-5c106cde.pth')'. But when I call 'prediction = model.predict(test_image)', I get AttributeError: 'collections.OrderedDict' object has no attribute 'predict'. I hope the problem is clear. Thanks in advance.
I'd guess that the checkpoint you are loading stores a model state dict (the model's parameters) rather than a model (the structure of the model plus its parameters). Try:
model = resnet18(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.eval()
where PATH is the path to the model checkpoint. You need to create model as an instance of the model class (which declares the model structure) so that you can load the checkpoint (parameters only, no structure) into it. So you'll need to find the appropriate class to import for resnet18, probably something along the lines of:
from torchvision.models import resnet18
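If you are unsure which kind of checkpoint you have, a quick check on the loaded object can tell you. The sketch below uses only the standard library: the helper name is_state_dict_like is made up, and the OrderedDict is a stand-in for what torch.load returns for a parameters-only checkpoint (placeholder values instead of real tensors).

```python
from collections import OrderedDict
from collections.abc import Mapping


def is_state_dict_like(obj):
    """Heuristic: a state dict is a mapping keyed by parameter names,
    whereas a full model object is not a mapping at all."""
    return isinstance(obj, Mapping) and all(isinstance(key, str) for key in obj)


# Stand-in for a parameters-only checkpoint; the keys mimic typical
# parameter names and the values are placeholders, not real tensors.
checkpoint = OrderedDict([("conv1.weight", [0.0]), ("fc.bias", [0.0])])

print(is_state_dict_like(checkpoint))   # True: pass it to load_state_dict
print(hasattr(checkpoint, "predict"))   # False: hence the AttributeError
```

When the check returns True, build the model class first and pass the loaded object to load_state_dict, as in the snippet above.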

What is wrong when training an autoencoder on the MNIST dataset with Caffe?

I want to use the MNIST dataset to train a simple autoencoder in Caffe with NVIDIA DIGITS.
I have:
caffe: 0.16.4
DIGITS: 5.1
python 2.7
I use the structure provided here:
https://github.com/BVLC/caffe/blob/master/examples/mnist/mnist_autoencoder.prototxt
Then I face 2 problems:
When I use the provided structure I get this error:
Traceback (most recent call last):
  File "digits/scheduler.py", line 512, in run_task
    task.run(resources)
  File "digits/task.py", line 189, in run
    self.before_run()
  File "digits/model/tasks/caffe_train.py", line 220, in before_run
    self.save_files_generic()
  File "digits/model/tasks/caffe_train.py", line 665, in save_files_generic
    'cannot specify two val image data layers'
AssertionError: cannot specify two val image data layers
When I remove the layer for "test-on-test", I get a bad result like this:
https://screenshots.firefox.com/8hwLmSmEP2CeiyQP/localhost
What is the problem?
The first problem occurs because the .prototxt has two layers named data that both belong to the TEST phase. The first layer that consumes data, i.e. flatdata, does not know which one to use (test-on-train or test-on-test). That's why the error goes away when you remove one of the TEST-phase data layers. Edit: I've checked the solver file and it has a test_state parameter that should switch between the test stages, but it's clearly not working in your case.
The second problem is a little more difficult to solve, and my knowledge of autoencoders is limited. It seems your Euclidean loss changes very little across iterations; I would check the base learning rate in your solver.prototxt, decrease it, and watch how the losses fluctuate.
Besides that, for the epochs/iterations that achieved a low error, have you checked the output data/images? Do they make sense?

Coefficient in support vector regression (SVR) using grid search (GridSearchCV) and Pipeline in Scikit Learn

I am having trouble accessing the coefficients of a support vector regression (SVR) model in scikit-learn when the model is embedded in a pipeline and a grid search.
Consider the following example:
from sklearn.datasets import load_iris
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR
from sklearn.feature_selection import SelectKBest
from sklearn.pipeline import Pipeline
iris = load_iris()
X_train = iris.data
y_train = iris.target
clf = SVR(kernel='linear')
select = SelectKBest(k=2)
steps = [('feature_selection', select), ('svr', clf)]
pipeline = Pipeline(steps)
grid = GridSearchCV(pipeline, param_grid={"svr__C":[10,10,100],"svr__gamma": np.logspace(-2, 2)})
grid.fit(X_train, y_train)
This seems to work fine, but when I try to access the coefficients of the best-fitting model with
grid.best_estimator_.coef_
I get an error message: AttributeError: 'Pipeline' object has no attribute 'coef_'.
I also tried to access the individual steps of the pipeline:
pipeline.named_steps['svr']
but could not find the coefficients there.
I just happened to come across the same problem, and this post had the answer:
grid.best_estimator_ contains an instance of the pipeline, which consists of steps. The last step should always be the estimator, so you should always find the coefficients at:
grid.best_estimator_.steps[-1][1].coef_
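For reference, here is a minimal sketch of the same setup on a current scikit-learn, reading the coefficients by step name instead of by position. The parameter grid is shortened and f_regression is passed to SelectKBest; both are choices made for this example, not part of the original question. Note that GridSearchCV fits clones of the estimator it is given, so the original pipeline object is never fitted, which is why pipeline.named_steps['svr'] has no coefficients.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVR

X, y = load_iris(return_X_y=True)

pipeline = Pipeline([
    ("feature_selection", SelectKBest(f_regression, k=2)),
    ("svr", SVR(kernel="linear")),
])
# Shortened grid for the example; gamma is left out because it has no
# effect on a linear kernel.
grid = GridSearchCV(pipeline, param_grid={"svr__C": [1, 10, 100]})
grid.fit(X, y)

# The refitted best pipeline lives in best_estimator_; look up the step by name.
best_svr = grid.best_estimator_.named_steps["svr"]
coef = best_svr.coef_
print(coef.shape)  # one row, one column per selected feature

# Boolean mask showing which of the original columns the coefficients refer to.
selected = grid.best_estimator_.named_steps["feature_selection"].get_support()
print(selected)
```

named_steps["svr"] and steps[-1][1] return the same fitted object; the name-based lookup just keeps working if the pipeline order changes.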

How do you export .caffemodels to other applications?

Is it possible to translate the information in a .caffemodel file so that it can be read by (for example) Matlab? That is, is there a way to write your model using something other than prototxt and import the weights trained using Caffe?
If the answer is "Nope, it's a binary file and will always remain that way", is there some documentation regarding the structure of the file so that one could extract the important information somehow?
As you know, .caffemodel consists of weights and biases.
A simple way to read weights and biases for a caffemodel given the prototxt would be to just load the network in Python and read the weights.
You can use:
import caffe
net = caffe.Net(<prototxt-file>, <model-file>, <phase>)
and access the params from net.params
I'll take VGG as an example:
import sys
from caffe.proto import caffe_pb2
net = caffe_pb2.NetParameter()
caffemodel = sys.argv[1]  # path to the .caffemodel file
with open(caffemodel, 'rb') as f:
    net.ParseFromString(f.read())
for layer in net.layer:
    # the first populated field of each layer is its name
    print(layer.ListFields()[0][-1])
#conv1
#relu1
#norm1
#pool1
#conv2
#relu2
#norm2
#pool2
#conv3
#relu3
#conv4
#relu4
#conv5
#relu5
#pool5
#fc6
#relu6
#drop6
#fc7
#relu7
#drop7
#fc8
#prob