Why are metrics in AllenNLP calculated with tensors? Can I define a metric based on strings? - allennlp

I'm using AllenNLP in my project, and I'm confused by the Metric: all of the metrics are calculated with tensors, include bleu and rouge. However sometime I may want to calculate the metric with strings tokenized with white spaces. The built-in metrics are calculated in a token level tokenized by BertTokenizer, and it may have a different result because of the difference of tokenization.
Currently I'm converting the tensor to tokens, putting them into a string and calculate my defined Metric in forward. The code is working now, but I wonder this may not be the right way.

What does not seem to be exactly the AllenNLP way of calculating metrics is getting them in the forward call. AllenNLP train_loop features a special blueprint function get_metrics for this.
However, if you mean that you update your metrics in forward and reset them in get_metrics, there seems to be no better way to do it.

Related

Using binary outcome variable and predictors with geeglm and gee in R?

When I use glm to perform logistic regression, first I read my binary outcome and predictor variables from a CSV file and convert them to factors (they are read in as ints). But if I do that before using gee or geeglm for clustering, geeglm gives me an error message, and although gee doesn't give me an error, the documentation says that you must use continuous predictors. It seems like I get sensical results if I just read in my binary variables and leave them as ints and do not convert them to factors. Is this an ok thing to do, and are there any concerns I need to be aware of? I specify binomial, and they are both for generalized linear models, so I'm not sure what's so wrong with passing factors.
Relatedly, in the scenario I'm describing, do I need to specify scale.fix=TRUE?
Thanks in advance!

LSTM Evolution Forecast

I have a confusion about the way the LSTM networks work when forecasting with an horizon that is not finite but I'm rather searching for a prediction in whatever time in future. In physical terms I would call it the evolution of the system.
Suppose I have a time series $y(t)$ (output) I want to forecast, and some external inputs $u_1(t), u_2(t),\cdots u_N(t)$ on which the series $y(t)$ depends.
It's common to use the lagged value of the output $y(t)$ as input for the network, such that I schematically have something like (let's consider for simplicity just lag 1 for the output and no lag for the external input):
[y(t-1), u_1(t), u_2(t),\cdots u_N(t)] \to y(t)
In this way of thinking the network, when one wants to do recursive forecast it is forced to use the predicted value at the previous step as input for the next step. In this way we have an effect of propagation of error that makes the long term forecast badly behaving.
Now, my confusion is, I'm thinking as a RNN as a kind of an (simple version) implementation of a state space model where I have the inputs, my output and one or more state variable responsible for the memory of the system. These variables are hidden and not observed.
So now the question, if there is this kind of variable taking already into account previous states of the system why would I need to use the lagged output value as input of my network/model ?
Getting rid of this does my long term forecast would be better, since I'm not expecting anymore the propagation of the error of the forecasted output. (I guess there will be anyway an error in the internal state propagating)
Thanks !
Please see DeepAR - a LSTM forecaster more than one step into the future.
The main contributions of the paper are twofold: (1) we propose an RNN
architecture for probabilistic forecasting, incorporating a negative
Binomial likelihood for count data as well as special treatment for
the case when the magnitudes of the time series vary widely; (2) we
demonstrate empirically on several real-world data sets that this
model produces accurate probabilistic forecasts across a range of
input characteristics, thus showing that modern deep learning-based
approaches can effective address the probabilistic forecasting
problem, which is in contrast to common belief in the field and the
mixed results
In this paper, they forecast multiple steps into the future, to negate exactly what you state here which is the error propagation.
Skipping several steps allows to get more accurate predictions, further into the future.
One more thing done in this paper is predicting percentiles, and interpolating, rather than predicting the value directly. This adds stability, and an error assessment.
Disclaimer - I read an older version of this paper.

how to predict query topics using word-topic matrix?

I'm implementing LDA using Java. I know how the algorithm works. In the end of the training (the given iterations) I will get 2 matrices (topic-word and document-topic) that represent the set of the input documents.
My problem is that when I input a new document (query) I want to use these matrices (or any other way) to get the document-topic vector of that query. How would I do that?
Are you using Variational Inference or Gibbs Sampling?
For Gibbs Sampling a typical approach is adding the new document/s to the inference, and only updating its own counters, keeping constant the counters for the documents you used to learn the model.
This is specified in equations 84 and 85 in Parameter Estimation for Text Analysis
I guess there has to be a similar approach in VI LDA.

Emulating IBM floating point multiplication/addition in VBA

I am attempting to emulate a (no longer existing) mainframe report generator in an Access 2003 or Access 2010 environment. The data it generates must match exactly with paper reports from the early 70s. Unfortunately, the earliest years data were run on hardware that used IBM floating point representation instead of IEEE. With the help of Google, I've found a library of VBA functions that will convert a float from decimal to the IEEE 754 32bit binary format. I had to modify the library to accept either 32bit or 64bit floats, so I have a modest working knowledge of floating point formats, however, I'm having trouble making the conversion from IEEE to IBM binary format, as well as trouble multiplying and adding either the IBM or the IEEE numbers.
I haven't turned up any other libraries for performing this conversion and arithmetic operations in VBA - is there an easier way to go about this, or an existing library that I'm not finding? Failing that, a clear and straightforward explanation of the relevant algorithms?
Thanks in advance.
To be honest you'd probably do better to start by looking at the Hercules emulator.
http://www.hercules-390.org/ Other than that in theory with VBA you can use the Decimal type to get good results (note you have to CDec to create these) it uses 12 bits with a variable power of ten scalar.
A quick google shows this post from the hercules group, which confirms Alberts point about needing to know the hardware:
---Snip--
In theory, but rather less so in practice. S/360 and S/370 had a
choice of Scientific or Commercial instruction sets. The former added
the FP instructions and registers to the base; the latter the decimal
instructions, including Edit and Edit & Mark. But larger 360 (iirc /65
and up) and 370 (/155 and up) models had the union of the two, called
the Universal instruction set, and at some point the S/370 dropped the
option.
---snip---
I have to say that having looked at the hercules source code you'll probably need to figure out exactly which floating point operation codes (in terms of precision single,long, extended) are being performed.
The problem is here's your confusing the issue of decimal type in access, and that of single and double type floating point values available in access.
If you use the currency data type in access, this is a scaled integer, and will not produce rounding (that is what most of us use for financial calculations and reports). You can also use decimal values in access, and again they don't round at all as they are packed decimals.
However, both the single and double values available inside of access are in fact the same format and conform to the IEEE floating point standard.
For an access single variable, this is a 32bit number, and the range is:
-3.402823E38
to
-1.401298E-45 for negative values
and
1.401298E-45
to
3.402823E38 for positive values
That looks to be the same to me as the IEEE 754 standard.
So, if you add up values in access as a single, you should get the rouding same results.
So, Intel based, and Access single and doubles I believe are the same as this IEEE standard.
The only real issue it and here is what is the format of the original data you're pulling into access, and what kinds of text or string or conversion process is occurring when that data is pulled in and stored?
Access can convert numbers. Try typing these values at the access command line prompt (debug window)
? hex(255)
Above will show FF
? csng(&hFF)
Above will show 255
Edit:
Ah, ok, I see now I have this reversed, my wrong here. The problem here is assuming you convert a number to the older IBM format (Excess 64?), you will THEN have to get your hands on their code that they used for adding those numbers. In fact, even back then, different IBM models depending on what you purchased actually produced different results (more money = more precision).
So, not only do you need conversion routines to convert to the internal representation, you THEN need the routines that add/subtract/multiply those numbers. So, just having conversion routines is not going to get you very far, since you also have to duplicate their exact routines that do math. Those types of routines are likely not all created equal in terms of how they round numbers etc.

2D non-polynomial function fitting from the command line

I just wrote a simple Unix command line utility that could be implemented a lot more efficiently. I can measure its performance by just running it on a number of inputs and measuring the time it takes. This will produce a set of pairs of numbers, s t, where s is the input size and t the processing time. In order to determine the performance characteristics of my utility, I need to fit a function through these data points. I can do this manually, but I prefer to be lazy and let a utility do it for me.
Does such a utility exist?
Its input is a sequence of pairs of numbers.
Its output is a formula that expresses how the second number depends as a function on the first, plus an error measure.
One step of the way is to have a utility that does this just for polynomials.
This has been discussed here but it didn't produce a ready-to-use solution.
The next step is to extend the utility to try non-polynomial terms: negative-degree polynomials (as in y = 1/x) and logarithmic terms (as in y = x log x) will need to be tried as well. One idea to cope with the non-polynomial terms is to just surround the polynomial fitting with x and y scale transformations. I don't know whether that will do. This question is related but not exactly the same.
As I said, I'm lazy: I'm not looking for ideas on how to to write this myself, I'm looking for a reliable result of a project that has already done it for me. Any suggestions?
I believe that SAS has this, RS/1 has this, I think that Mathematica has this, Execel and most spreadsheets have a primitive form of this and usually there are add-ons available for more advanced forms. There are lots of Lab analysis and Statistical analysis tools that have stuff like this.
RE., Command Line Tools:
SAS, RS/1 and Minitab were all command line tools 20 years ago when I used them. I bet at least one of them still has this capability.