How do I get the definition for a sense in NLTK's senseval module? - nltk

In the NLTK senseval module, senses are of the form HARD1, HARD2, etc. (see source here). However, there doesn't seem to be a way to get the actual definition. I'm trying to implement the Lesk algorithm, and I'm now attempting to check whether the sense predicted by the Lesk algorithm is correct (using a definition from WordNet).
The problem I'm running into is how to unify the WordNet definition with the senseval answer (HARD1, HARD2). Does anybody know how to translate the SENSEVAL sense into a definition, or look it up somewhere?

I ended up finding out that these senses correspond to those in WordNet 1.7, which is pretty archaic (it doesn't seem easily installable on Mac OS X or Ubuntu 11.04).
There are no online versions of WordNet 1.7 that I could find.
This site also has some useful information about these three corpora; for example, it says that the six senses of interest were taken from the Longman English Dictionary Online (circa 2001).
It describes the source of HARD as WordNet 1.7.
Ultimately, I ended up manually mapping the definitions to those in WordNet 3.0. If you're interested, here's the dictionary. Note, however, that I'm not an expert in linguistics, and the mappings aren't exact.
# A map of SENSEVAL senses to WordNet 3.0 senses.
# SENSEVAL-2 uses WordNet 1.7, which is no longer installable on most modern
# machines and is not the version that the NLTK comes with.
# As a consequence, we have to manually map the following
# senses to their equivalent(s).
SV_SENSE_MAP = {
    "HARD1": ["difficult.a.01"],      # not easy, requiring great physical or mental effort
    "HARD2": ["hard.a.02",            # dispassionate
              "difficult.a.01"],
    "HARD3": ["hard.a.03"],           # resisting weight or pressure
    "interest_1": ["interest.n.01"],  # readiness to give attention
    "interest_2": ["interest.n.03"],  # quality of causing attention to be given to
    "interest_3": ["pastime.n.01"],   # activity, etc. that one gives attention to
    "interest_4": ["sake.n.01"],      # advantage, advancement or favor
    "interest_5": ["interest.n.05"],  # a share in a company or business
    "interest_6": ["interest.n.04"],  # money paid for the use of money
    "cord": ["line.n.18"],            # something (as a cord or rope) that is long and thin and flexible
    "formation": ["line.n.01", "line.n.03"],  # a formation of people or things one beside another
    "text": ["line.n.05"],            # text consisting of a row of words written across a page or computer screen
    "phone": ["telephone_line.n.02"], # a telephone connection
    "product": ["line.n.22"],         # a particular kind of product or merchandise
    "division": ["line.n.29"],        # a conceptual separation or distinction
    "SERVE12": ["serve.v.02"],        # do duty or hold offices; serve in a specific function
    "SERVE10": ["serve.v.06"],        # provide (usually but not necessarily food)
    "SERVE2": ["serve.v.01"],         # serve a purpose, role, or function
    "SERVE6": ["service.v.01"],       # be used by; as of a utility
}
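With this map, checking a Lesk prediction against the senseval gold label reduces to a membership test. A minimal sketch (the dictionary is abridged here for brevity; in practice the predicted synset name would come from something like NLTK's `lesk(...)` result via `.name()`):

```python
# Sketch: score a Lesk prediction against a senseval gold label.
# SV_SENSE_MAP is the dictionary above, abridged here for brevity.
SV_SENSE_MAP = {
    "HARD1": ["difficult.a.01"],
    "HARD2": ["hard.a.02", "difficult.a.01"],
    "HARD3": ["hard.a.03"],
}

def is_correct(senseval_label, predicted_synset_name):
    """True if the predicted WordNet 3.0 synset matches the senseval sense."""
    return predicted_synset_name in SV_SENSE_MAP.get(senseval_label, [])

# With NLTK, the prediction would come from something like:
#   lesk(instance.context, "hard").name()
print(is_correct("HARD1", "difficult.a.01"))  # True
print(is_correct("HARD3", "difficult.a.01"))  # False
```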

Related

IR Offline Metric Precision#k in Topic Modeling

Good day everyone! I'm currently learning LDA, and I'm curious about how to validate the results of a topic model.
I've read this statement in the paper https://arxiv.org/pdf/2107.02173.pdf (p. 10):
To the extent that our experimentation accurately represents current practice, our results do suggest that topic model evaluation—both automated and human—is overdue for a careful reconsideration. In this, we agree with Doogan and Buntine (2021), who write that “coherence measures designed for older models [. . . ] may be incompatible with newer models” and instead argue for evaluation paradigms centered on corpus exploration and labeling. The right starting point for this reassessment is the recognition that both automated and human evaluations are abstractions of a real-world problem. The familiar use of precision-at-10 in information retrieval, for example, corresponds to a user who is only willing to consider the top ten retrieved documents.
Suppose you have a corpus of Play Store app user reviews. After a topic model processes this corpus, let's say it generates K=10 topics. How would we use the precision@10 offline metric to evaluate the result (10 topics)?

How to predict if a phrase is related to a short text or an article using supervised learning?

I have a set of short phrases and a set of texts. I want to predict whether a phrase is related to an article. A phrase that doesn't appear in the article may still be related.
Some examples of annotated data (not real) look like this:
Example 1
Phrase: Automobile
Text: Among the more affordable options in the electric-vehicle marketplace, the 2021 Tesla Model 3 is without doubt the one with the most name recognition. It borrows some styling cues from the company's Model S sedan and Model X SUV, but goes its own way with a unique interior design and an all-glass roof. Acceleration is quick, and the Model 3's chassis is playful as well—especially the Performance model's, which receives a sportier suspension and a track driving mode. But EV buyers are more likely interested in driving range than speediness or handling, and the Model 3 delivers there too. The base model offers up to 263 miles of driving range according to the EPA, and the more expensive Long Range model can go up to 353 per charge.
Label: Related (PS: for a given text, one and only one phrase is labeled 'Related'; all others are 'Unrelated')
Example 2
Phrase: Programming languages
Text: Python 3.9 uses a new parser, based on PEG instead of LL(1). The new parser's performance is roughly comparable to that of the old parser, but the PEG formalism is more flexible than LL(1) when it comes to designing new language features. We'll start using this flexibility in Python 3.10 and later.
The ast module uses the new parser and produces the same AST as the old parser.
In Python 3.10, the old parser will be deleted and so will all functionality that depends on it (primarily the parser module, which has long been deprecated). In Python 3.9 only, you can switch back to the LL(1) parser using a command line switch (-X oldparser) or an environment variable (PYTHONOLDPARSER=1).
Label: Related (i.e., all other phrases are 'Unrelated')
I think I may have to use, for example, a pre-trained BERT, because this kind of prediction needs additional knowledge. But this doesn't seem like a standard classification problem, so I can't find out-of-the-box code. May I have some advice on how to combine existing wheels and train them?

What is differentiable programming?

Native support for differentiable programming has been added to Swift for the Swift for TensorFlow project. Julia has something similar with Zygote.
What exactly is differentiable programming?
what does it enable? Wikipedia says
the programs can be differentiated throughout
but what does that mean?
how would one use it (e.g. a simple example)?
and how does it relate to automatic differentiation (the two seem conflated a lot of the time)?
I like to think about this question in terms of user-facing features (differentiable programming) vs implementation details (automatic differentiation).
From a user's perspective:
"Differentiable programming" refers to APIs for differentiation. An example is a def gradient(f) higher-order function for computing the gradient of f. These APIs may be first-class language features, or implemented in and provided by libraries.
"Automatic differentiation" is an implementation detail for automatically computing derivative functions. There are many techniques (e.g. source code transformation, operator overloading) and multiple modes (e.g. forward-mode, reverse-mode).
Explained in code:
def f(x):
    return x * x * x

∇f = gradient(f)
print(∇f(4))  # 48.0

# Using the `gradient` API:
# ▶ differentiable programming.
# How `gradient` works to compute the gradient of `f`:
# ▶ automatic differentiation.
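To make the distinction concrete, here is a toy forward-mode automatic differentiator built on dual numbers (a sketch supporting only multiplication, which is enough for the f above). The gradient function on top is the user-facing "differentiable programming" API; the dual-number arithmetic underneath is the "automatic differentiation":

```python
# Toy forward-mode automatic differentiation with dual numbers.
# A dual number carries a value and its derivative through every operation.
class Dual:
    def __init__(self, value, deriv=0.0):
        self.value, self.deriv = value, deriv

    def __mul__(self, other):
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

def gradient(f):
    """Return a function computing f'(x) by seeding the derivative with 1."""
    return lambda x: f(Dual(x, 1.0)).deriv

def f(x):
    return x * x * x

df = gradient(f)
print(df(4.0))  # 48.0
```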
I had never heard the term "differentiable programming" before reading your question, but having used the concepts noted in your references, both from the side of writing code to compute a derivative with symbolic differentiation and with automatic differentiation, and having written interpreters and compilers, to me this just means that they have made it easier to calculate the numeric value of the derivative of a function. I don't know whether they made it a first-class citizen, but the new way doesn't require a function/method call; it is done with syntax, and the compiler/interpreter hides the translation into calls.
If you look at the Zygote example it clearly shows the use of prime notation
julia> f(10), f'(10)
Most seasoned programmers would guess what I just noted even without a research paper explaining it. In other words, it is just that obvious.
Another way to think about it: if you have ever tried to calculate a derivative in a programming language, you know how hard it can be at times, and you may have asked yourself why they (the language designers and programmers) don't just add it to the language. In these cases, they did.
What surprises me is how long it took before derivatives became available via syntax instead of calls, but if you have ever worked with scientific code or coded neural networks at that level, then you will understand why this is a concept being touted as something of value.
Also I would not view this as another programming paradigm, but I am sure it will be added to the list.
How does it relate to automatic differentiation (the two seem conflated a lot of the time)?
In both cases that you referenced, they use automatic differentiation to calculate the derivative instead of symbolic differentiation. I do not view differentiable programming and automatic differentiation as two distinct sets; rather, differentiable programming needs a means of being implemented, and the way they chose was automatic differentiation. They could have chosen symbolic differentiation or some other means.
It seems you are trying to read more into what differentiable programming is than it really is. It is not a new way of programming, but just a nice feature added for doing derivatives.
Perhaps if they named it differentiable syntax it might have been more clear. The use of the word programming gives it more panache than I think it deserves.
EDIT
After skimming Swift Differentiable Programming Mega-Proposal and trying to compare that with the Julia example using Zygote, I would have to modify the answer into parts that talk about Zygote and then switch gears to talk about Swift. They each took a different path, but the commonality and bottom line is that the languages know something about differentiation which makes the job of coding them easier and hopefully produces less errors.
About the Wikipedia quote that
the programs can be differentiated throughout
At first reading it seems like nonsense, or at least it lacks enough detail to be understood in context, which I am sure is why you asked.
After many years of digging into what others are trying to communicate, one learns to take anything that has not been peer reviewed with a grain of salt and, unless it is absolutely necessary to understand, to just ignore it. In this case, if you ignore that sentence, most of your reference makes sense. However, I take it that you want an answer, so let's try to figure out what it means.
The key word that has me perplexed is throughout. Since you note the statement came from Wikipedia, and Wikipedia gives three references for it, a search shows the word throughout appears in only one of them:
∂P: A Differentiable Programming System to Bridge Machine Learning and Scientific Computing
Thus, since our ∂P system does not require primitives to handle new types, this means that almost all functions and types defined throughout the language are automatically supported by Zygote, and users can easily accelerate specific functions as they deem necessary.
So my take on this is that by going back to the source (i.e., the paper), you can better understand how that statement percolated up into Wikipedia, though it seems the meaning was lost along the way.
In this case, if you really want to know the meaning of that statement, you should ask the author of the statement directly on the Wikipedia talk page.
Also note that the paper referenced is not peer reviewed, so the statements in there may not have any meaning amongst peers at present. As I said, I would just ignore it and get on with writing wonderful code.
You can guess its definition from the idea of differentiability. It is used for optimization, i.e., to calculate minimum or maximum values: many such problems can be solved by finding the appropriate function and then using techniques to find the required maximum or minimum.

Boolean function, what is the purpose of DNF and CNF?

Boolean functions can be expressed in Disjunctive normal form (DNF) or Conjunctive normal form (CNF). Can anyone explain why these forms are useful?
CNF is useful because this form directly describes the Boolean SAT problem, which, while NP-complete, has many incomplete and heuristic exponential-time solvers. CNF has been further standardized into a file format called the "DIMACS CNF" file format, which most solvers can operate on. Thus, for example, the chip industry can verify their circuit designs by converting them to DIMACS CNF and feeding the result into any of the available solvers. The Tseitin transformation can convert circuits to an equisatisfiable CNF.
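CNF's role as a solver interchange format is easiest to see in the file format itself. For illustration, the formula (x1 OR NOT x2) AND (x2 OR x3) in DIMACS CNF: the `p cnf` header declares 3 variables and 2 clauses, each clause is a line of integers terminated by 0, and a negative integer denotes a negated literal:

```
c (x1 OR NOT x2) AND (x2 OR x3)
p cnf 3 2
1 -2 0
2 3 0
```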
DNF is not as useful practically, but converting a formula to DNF means one can see a list of possible assignments that would satisfy the formula. Unfortunately, converting a formula to DNF is hard in general and can lead to exponential blow-up (a very large DNF), which is to be expected, because a formula can have exponentially many satisfying assignments.
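That correspondence between DNF terms and satisfying assignments is easy to demonstrate: enumerating the satisfying rows of a truth table yields a DNF directly, one conjunct per satisfying assignment. A sketch (the example formula and helper names are illustrative, not from any particular library):

```python
from itertools import product

def to_dnf(formula, names):
    """Build a DNF for `formula` by enumerating its satisfying assignments."""
    terms = []
    for values in product([False, True], repeat=len(names)):
        if formula(*values):
            # One conjunct per satisfying row: negate variables that are False.
            lits = [n if v else "~" + n for n, v in zip(names, values)]
            terms.append("(" + " & ".join(lits) + ")")
    return " | ".join(terms)

# Example: x XOR y has two satisfying assignments, hence two DNF terms.
print(to_dnf(lambda x, y: x != y, ["x", "y"]))  # (~x & y) | (x & ~y)
```

Since the loop visits all 2^n rows, this also shows why DNF conversion blows up in general.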
While CNF can be succinct compared with DNF, it is sometimes hard to reason with, because it can lose structure when converted from a circuit, for example, so another succinct form would be useful. The and-inverter graph data structure was designed for this purpose; it more closely models the structure of circuits and is thus much easier to reason with for some types of formulas. However, not many solvers operate on it.
Other forms include:
Binary Decision Diagram
Propositional directed acyclic graph
Negation normal form
Others (wikipedia)
It helps to express the functions in some standard way; with a standard form, it is easier to run many algorithms automatically.
Both forms can be used, for example, in automated theorem proving, notably CNF in the resolution method.

PHP/Python/C/C++ library/application to match/correct/give suggestions to input

I'd like a simple & lightweight library/application in PHP/Python/C/C++ to match/correct/give suggestions for input. Example in/out:
Input: Webdevelopment ==> Output: Web Development
Input: Web developmen ==> Output: Web Development
Input: Web develop ==> Output: Web Development
Given that there is a database of correct words and phrases, I just need the library to match/guess phrases. Please suggest one if you know any.
How to Write a Spelling Corrector by Google's Director of Research Peter Norvig contains a spelling corrector in 21 lines of Python, complete with explanations.
You will have to convert this into a module yourself, but that should be easy. Of course, you will also need a corpus (i.e. words), but he gives sources for these as well.
I guess what you want to do is compute the edit distance between strings (an input, output pair).
One of the simpler ones (which I've used for figuring out a team's full name from its 3-letter abbreviation; it's a long story...) is the Levenshtein distance. The last external link on the page has a bunch of different implementations of it (it turns out it's standard in PHP since 4.0.1).
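A minimal sketch in Python: a textbook dynamic-programming Levenshtein distance, used to pick the closest phrase from the database. The phrase list here is illustrative; the stdlib's difflib.get_close_matches is an alternative if you don't need true edit distance:

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between strings a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def suggest(query, phrases):
    """Return the phrase with the smallest edit distance to the query."""
    return min(phrases, key=lambda p: levenshtein(query.lower(), p.lower()))

phrases = ["Web Development", "Web Design", "Software Engineering"]
print(suggest("Webdevelopment", phrases))  # Web Development
print(suggest("Web developmen", phrases))  # Web Development
```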