Trying to run an NLP model with an ELECTRA model instead of a BERT model

I want to run the wl-coref model with an ELECTRA model instead of a BERT model. However, I get an error message with the ELECTRA model and can't find a hint in the Huggingface documentation on how to fix it.
I tried different BERT models such as roberta-base, bert-base-german-cased, or SpanBERT/spanbert-base-cased. They all work.
But if I try an ELECTRA model, like google/electra-base-discriminator or german-nlp-group/electra-base-german-uncased, it doesn't work.
The error that is displayed:
out, _ = self.bert(subwords_batches_tensor, attention_mask=torch.tensor(attention_mask, device=self.config.device))
ValueError: not enough values to unpack (expected 2, got 1)
The error comes from the _bertify method, in line 349.

Just remove the underscore _ from the unpacking. ELECTRA does not return a pooler output the way BERT or RoBERTa do:
from transformers import AutoTokenizer, AutoModel

def bla(model_id: str):
    t = AutoTokenizer.from_pretrained(model_id)
    m = AutoModel.from_pretrained(model_id)
    print(m(**t("this is a test", return_tensors="pt")).keys())

bla("google/electra-base-discriminator")
bla("roberta-base")
Output:
odict_keys(['last_hidden_state'])
odict_keys(['last_hidden_state', 'pooler_output'])
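Applied to the failing line in _bertify, that means dropping the second unpacking target. A minimal sketch (indexing the output instead, which works whether the model returns one value or two; not necessarily the exact patch the wl-coref authors would choose):

out = self.bert(
    subwords_batches_tensor,
    attention_mask=torch.tensor(attention_mask, device=self.config.device),
)[0]  # [0] is the last_hidden_state for BERT, RoBERTa, and ELECTRA alike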


How can I apply the model that I built against different data using stat models?
For example, here I use OLS to model the data in file1. I want to use modelX on the file2 data. Is that possible?
modelX = sm.OLS(y~ a+b+c, data=file1).fit()
modelY = sm.modelX.apply(y~ a+b+c, data=file2)
What I tried:
import statsmodels.api as sm
import seaborn as sns
mpg = sns.load_dataset("mpg")
model = sm.OLS(mpg.weight, mpg.mpg)
results = model.fit()
results.apply(mpg.weight, mpg.mpg)
The error:
AttributeError: 'OLSResults' object has no attribute 'apply'
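For reference, a fitted statsmodels results object has no apply method; the standard pattern is to call predict on the fitted results, passing the new data. A minimal sketch using the formula API and the same mpg dataset (the split into two frames here just stands in for file1 and file2):

import seaborn as sns
import statsmodels.formula.api as smf

mpg = sns.load_dataset("mpg")
file1, file2 = mpg.iloc[:200], mpg.iloc[200:]  # stand-ins for two separate datasets

modelX = smf.ols("weight ~ mpg", data=file1).fit()  # fit on the first dataset
predictions = modelX.predict(file2)                 # apply the fit to the second
print(predictions.head())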

CNN: I'm trying to generate a confusion matrix and classification report for multiclass classification of a custom model, but the values don't seem correct

# Confusion matrix
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

plt.figure(figsize=(16, 9))
y_pred_labels = [np.argmax(label) for label in predict]
cm = confusion_matrix(test_set.classes, y_pred_labels)
# show cm
sns.heatmap(cm, annot=True, fmt='d', xticklabels=class_labels, yticklabels=class_labels)

from sklearn.metrics import classification_report
cr = classification_report(test_set.classes, y_pred_labels, target_names=class_labels)
print(cr)
[Load Data from directory](https://i.stack.imgur.com/p87gv.png)
[accuracy](https://i.stack.imgur.com/1dSab.png)
[evaluate](https://i.stack.imgur.com/LEV0X.png)
[predict](https://i.stack.imgur.com/Kiim2.png)
[cm and cr](https://i.stack.imgur.com/sQN9P.png)
[cr](https://i.stack.imgur.com/dMAaB.png)
[cm](https://i.stack.imgur.com/LzqcY.png)
The complete flow is shown in the screenshots above.
Can anyone find where the actual problem is? How can I get the correct values in the classification report? The predictions themselves look correct when I use the model.predict method and pass it a dataset.
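For what it's worth, a frequent cause of a mismatched confusion matrix in this kind of pipeline, assuming test_set comes from a Keras flow_from_directory call: if the generator shuffles, test_set.classes no longer lines up with the row order of the predictions. A minimal sketch of the usual precaution (test_datagen, the directory, image size, and batch size are all placeholders):

# Hypothetical test generator: shuffle=False keeps the file order fixed,
# so test_set.classes matches the row order of model.predict's output.
test_set = test_datagen.flow_from_directory(
    "data/test",
    target_size=(224, 224),
    batch_size=32,
    class_mode="categorical",
    shuffle=False,
)
predict = model.predict(test_set)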

Getting alignment/attention during translation in OpenNMT-py

Does anyone know how to get the alignment weights when translating in OpenNMT-py? Usually the only output is the resulting sentences, and I have tried to find a debugging flag or similar for the attention weights. So far, I have been unsuccessful.
I'm not sure if this is a new feature, since I did not come across it when looking for alignments a few months back, but OpenNMT-py seems to have added a flag -report_align to output word alignments along with the translation.
https://opennmt.net/OpenNMT-py/FAQ.html#raw-alignments-from-averaging-transformer-attention-heads
Excerpt from opennmt.net:
Currently, we support producing word alignment while translating for Transformer based models. Using -report_align when calling translate.py will output the inferred alignments in Pharaoh format. Those alignments are computed from an argmax on the average of the attention heads of the second to last decoder layer.
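In practice that is a normal translate.py call with the extra flag, e.g. python translate.py -model model.pt -src src.txt -output pred.txt -report_align (the model and file paths here are placeholders); the inferred Pharaoh-format alignments are then emitted alongside each translated sentence.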
You can get the attention matrices. Note that attention is not the same as alignment, which is a term from statistical (not neural) machine translation.
There is a thread on GitHub discussing it. Here is a snippet from the discussion: when you get the translations from the model, the attentions are in the attn field.
import onmt
import onmt.io
import onmt.translate
import onmt.ModelConstructor
from collections import namedtuple

# Load the model.
Opt = namedtuple('Opt', ['model', 'data_type', 'reuse_copy_attn', 'gpu'])
opt = Opt("PATH_TO_SAVED_MODEL", "text", False, 0)
fields, model, model_opt = onmt.ModelConstructor.load_test_model(
    opt, {"reuse_copy_attn": False})

# Test data
data = onmt.io.build_dataset(
    fields, "text", "PATH_TO_DATA", None, use_filter_pred=False)
data_iter = onmt.io.OrderedIterator(
    dataset=data, device=0,
    batch_size=1, train=False, sort=False,
    sort_within_batch=True, shuffle=False)

# Translator
translator = onmt.translate.Translator(
    model, fields, beam_size=5, n_best=1,
    global_scorer=None, cuda=True)
builder = onmt.translate.TranslationBuilder(
    data, translator.fields, 1, False, None)

batch = next(data_iter)
batch_data = translator.translate_batch(batch, data)
translations = builder.from_batch(batch_data)
translations[0].attn  # <--- here are the attentions

Latex or HTML summary output table for vglm regression objects (VGAM)

I'm trying to get a LaTeX or HTML output of the regression results of a VGAM model (in the example below it's a generalized ordinal logit), but the packages I know for this purpose do not work with a vglm object.
Here you can see a little toy example with the error messages I'm getting:
library(VGAM)
n <- 1000
x <- rnorm(n)
y <- ordered( rbinom(n, 3, prob = .5) )
ologit <- vglm(y ~ x,
               family = cumulative(parallel = F, reverse = TRUE),
               model = T)
library(stargazer)
stargazer(ologit)
Error in objects[[i]]$zelig.call : $ operator not defined for this S4 class
library(texreg)
htmlreg(ologit)
Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘extract’ for signature ‘"vglm"’
library(memisc)
mtable(ologit)
Error in UseMethod("getSummary") : no applicable method for 'getSummary' applied to an object of class "c('vglm', 'vlm', 'vlmsmall')"
I just had the same problem. My first workaround is to run the ordinal logit regression with the polr function of the MASS package. The resulting objects are easily visualized and summarized by the usual packages (I recommend sjPlot's tab_model function for the table output!).
The second option is to craft your own table, which you then turn into a neat HTML object via stargazer.
For this you need to know that S4 objects are not subsettable in the same manner as conventional objects (http://adv-r.had.co.nz/Subsetting.html). The most straightforward solution is to subset the object, i.e. extract the relevant slots with an @ instead of a $ symbol:
sumobject <- summaryvglm(yourvglmobject)
stargazer(sumobject@coef3, type = "html", out = "RegDoc.doc")
A little cumbersome but it did the trick for me. Hope this helps!

Converting a theano model built on GPU to CPU?

I have some pickle files of deep learning models built on a GPU, and I'm trying to use them in production. But when I try to unpickle them on the server, I get the following error:
Traceback (most recent call last):
  File "score.py", line 30, in <module>
    model = (cPickle.load(file))
  File "/usr/local/python2.7/lib/python2.7/site-packages/Theano-0.6.0-py2.7.egg/theano/sandbox/cuda/type.py", line 485, in CudaNdarray_unpickler
    return cuda.CudaNdarray(npa)
AttributeError: ("'NoneType' object has no attribute 'CudaNdarray'", , (array([[ 0.011515  ,  0.01171047,  0.10408644, ..., -0.0343636 ,
         0.04944979, -0.06583775],
       [-0.03771918,  0.080524  , -0.10609912, ...,  0.11019105,
        -0.0570752 ,  0.02100536],
       [-0.03628891, -0.07109226, -0.00932018, ...,  0.04316209,
         0.02817888,  0.05785328],
       ...,
       [ 0.0703947 , -0.00172865, -0.05942701, ..., -0.00999349,
         0.01624184,  0.09832744],
       [-0.09029484, -0.11509365, -0.07193922, ...,  0.10658887,
         0.17730837,  0.01104965],
       [ 0.06659461, -0.02492988,  0.02271739, ..., -0.0646857 ,
         0.03879852,  0.08779807]], dtype=float32),))
I checked for that CudaNdarray package on my local machine and it is not installed, yet I am still able to unpickle the files there. On the server, however, I am not. How do I make them run on a server which doesn't have a GPU?
There is a script in pylearn2 which may do what you need:
pylearn2/scripts/gpu_pkl_to_cpu_pkl.py
The related Theano code is here.
From there, it looks like there is an option config.experimental.unpickle_gpu_on_cpu which you could set and which would make CudaNdarray_unpickler return the underlying raw Numpy array.
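If you want to try that flag directly, here is a minimal sketch, assuming an old Theano build that actually exposes this experimental option (the pickle path is a placeholder):

import theano
# Assumption: the Theano version in use exposes this experimental flag; it
# makes CudaNdarray_unpickler return the underlying raw NumPy array on CPU.
theano.config.experimental.unpickle_gpu_on_cpu = True

import cPickle
with open("model.pkl", "rb") as f:  # placeholder path
    model = cPickle.load(f)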
This works for me. Note: this doesn't work unless the following environment variable is set: export THEANO_FLAGS='device=cpu'
import os
import sys

from pylearn2.utils import serial
import pylearn2.config.yaml_parse as yaml_parse

if __name__ == "__main__":
    _, in_path, out_path = sys.argv
    os.environ['THEANO_FLAGS'] = "device=cpu"
    model = serial.load(in_path)
    model2 = yaml_parse.load(model.yaml_src)
    model2.set_param_values(model.get_param_values())
    serial.save(out_path, model2)
I solved this problem by saving just the parameters W and b, not the whole model. You can save the parameters as described here: http://deeplearning.net/software/theano/tutorial/loading_and_saving.html?highlight=saving%20load#robust-serialization
This saves the CudaNdarray values as NumPy arrays. You then read the parameters back with numpy.load(), and finally convert each NumPy array into a shared variable with theano.shared().
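A minimal sketch of that round trip, assuming W and b are the Theano shared variables of the trained model:

import numpy as np
import theano

# On the GPU machine: pull the parameter values out as plain NumPy arrays.
np.save("W.npy", W.get_value())
np.save("b.npy", b.get_value())

# On the CPU-only server: rebuild shared variables from the saved arrays.
W_cpu = theano.shared(np.load("W.npy"), name="W")
b_cpu = theano.shared(np.load("b.npy"), name="b")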