What is the term for the conceptual distance between ordered nodes? - terminology

What is the name for the ordered relation between nodes?
For example: A color ontology represented in a trie has ordered color objects such that the marginal node between yellow and blue is green, the node between blue and green is teal, etc. I called this indexical.
I found that the term indexical is owned by linguistics (en.wikipedia.org/wiki/Indexicality). I have used the term indexical in academic presentations to a civil engineering audience with more computer science awareness than usual -- nobody questioned my definition.
Searching online I found 'edit distance' and 'ordered.' Neither has the meaning I want.
In my presentation, I use the spork, the spoon and fork, as an example of a marginal object that requires a new node between spoon and fork (www.youtube.com/watch?v=ruJ76-o5lxU).
A broad example: Take every product in a grocery store and line the items up, arranging them so that closeness represents similarity. So oranges and apples will be closer to each other than to beef and fish, and all of those will be closer to one another than to paper towels.
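One way to make that arrangement concrete is classical multidimensional scaling, which places items on a line so that distances approximate a given dissimilarity matrix. A minimal sketch assuming scikit-learn (the dissimilarity values below are invented for illustration):

```python
import numpy as np
from sklearn.manifold import MDS

# Invented pairwise dissimilarities (0 = identical); smaller = more similar.
items = ["orange", "apple", "beef", "fish", "paper towels"]
D = np.array([
    [0.0, 0.2, 0.7, 0.8, 1.0],
    [0.2, 0.0, 0.7, 0.8, 1.0],
    [0.7, 0.7, 0.0, 0.3, 1.0],
    [0.8, 0.8, 0.3, 0.0, 1.0],
    [1.0, 1.0, 1.0, 1.0, 0.0],
])

# Project onto one dimension so distance along the line reflects similarity.
mds = MDS(n_components=1, dissimilarity="precomputed", random_state=0)
positions = mds.fit_transform(D).ravel()

for item, pos in sorted(zip(items, positions), key=lambda p: p[1]):
    print(f"{item:>12}: {pos:+.2f}")
```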
EDIT1: Revised the examples to any position between two points.
EDIT2: Simplified question.

Related

Training conversations using sequence models

I have a question regarding training on conversations. The context is that the next statement is not necessarily a function of only the previous statement, but of any statement in the body of the conversation. For example:
Person 1: What is your favorite food and restaurant?
Person 2: My favorite food is burgers, and McDonald's is my favorite restaurant.
Person 1: Why do you like burgers?
Person 2: Because I don't care about the health aspect while eating.
Person 1: Why do you like McDonald's when there are so many places where you can buy a burger?
As we can see, the last question was derived from an answer received three turns before.
In this context, how do I train an LSTM so that it remembers all of the previous context?
Essentially, I am looking for an approach to constructing my training data and output sentences.
I'm not sure that an LSTM will give you sufficient long-term memory for the example conversation you posted. You will likely need to use some sort of transformer memory network to maintain context. Take a look at approaches for the "persona chat" problem as well as this recent paper on handling conversational context.
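One common way to construct the training pairs, whichever model you end up using, is to make the input for each turn the concatenation of all prior turns and the target the next utterance. A minimal sketch (the separator token and truncation length here are arbitrary choices, not a prescription):

```python
# Build (context, response) training pairs where the context is the
# full conversation history up to that turn, joined by a separator.
SEP = " <eos> "          # arbitrary separator token
MAX_CONTEXT_TURNS = 10   # truncate very long histories

conversation = [
    "What is your favorite food and restaurant?",
    "My favorite food is burgers, and McDonald's is my favorite restaurant.",
    "Why do you like burgers?",
    "Because I don't care about the health aspect while eating.",
    "Why do you like McDonald's when there are so many places where you can buy a burger?",
]

pairs = []
for i in range(1, len(conversation)):
    history = conversation[max(0, i - MAX_CONTEXT_TURNS):i]
    pairs.append((SEP.join(history), conversation[i]))

for context, response in pairs:
    print(f"INPUT:  {context}\nTARGET: {response}\n")
```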

Building an autonomic drugs widget for medical education

I've made my way over to this community because I'm planning on building a widget to help medical students understand the effects of various autonomic medications on cardiovascular metrics like heart rate (HR), blood pressure (BP: systolic, diastolic, and mean), and peripheral resistance (SVR). Some background: I'm a 3rd-year med student in the US without fluency in any programming language (which makes this particularly difficult), but I am willing to spend the time to pick up what I need to know to make this happen.
Regarding the project:
The effects of autonomic medications like epinephrine, norepinephrine, beta-blockers, and alpha-blockers on the cardiovascular system are of great interest to physicians because these drugs can be used to resuscitate, to prep for surgery, to slow the progression of cardiovascular and respiratory disease, and even as antidotes for certain toxicities. There are four receptor types we are primarily concerned with: alpha1, alpha2, beta1, and beta2. The receptor selectivity profile of any given drug is what governs its effects on the CV system. The way these effects are taught and tested in med school classrooms and by the United States board exams is in the form of graphs.
The impetus for this project is that many of my classmates and I struggled with this concept when we were initially learning it, and I believe a large part of that arises from the lack of a resource that shows the changes in the graphs from baseline, in real time.
When being taught this info, we are required to consider: a) the downstream effects when the receptor types listed above are stimulated (by an agonist drug) or inhibited (by an antagonist); b) the receptor specificities of each of the autonomic drugs of interest (there are about 8 that are very important); c) how to interpret the graphs shown above and how those graphs would change if multiple autonomics were administered in succession. (Exams and the boards love to show graphs with various points marked along them, then ask which drugs are responsible for the changes seen, just like the example above.)
The current methods of learning these three points are a mess, and having gone through it, I'd like to do what I can to contribute to building a more effective resource.
My goal is to create a widget that allows a user to visualize these changes with up to 3 drugs in succession. Here is a rough sketch of the goal.
In this example, norepinephrine has strong alpha1 agonist effects, which cause an increase in systolic (blue line), diastolic (red line), and mean BP, as well as peripheral resistance. Due to the increased BP, there is a reflexive decrease in HR.
Upon the administration of phentolamine, a strong alpha1 antagonist, the BP and SVR decline while HR increases reflexively.
Regarding the widget, I would like the user to be able to choose up to 3 drugs from a drop-down menu (e.g., Drug 1, Drug 2, Drug 3), and the graphs to reflect the effects of those drugs on the CV metrics while ALSO taking into account the interactions of the drugs with one another.
This is an IMPORTANT point: the order in which drugs are added matters, because certain receptors become blocked, preventing later drugs from having their primary effect, so they revert to their secondary effect.
If you're still following me on this, what I'm looking for is some help in figuring out how best to approach all the possibilities that can happen. Should I try to learn if-then statements and write a script that produces graphs based on those (e.g., if epi, then Psys = x, Pdia = y, MAP = z)? Should I create a contingency table in Excel in which I list the 8 drugs I'm focusing on, assign values for the metrics, and then plot those, essentially taking into account all the permutations? Any thoughts and direction would be greatly appreciated.
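For instance, I picture the contingency-table option looking something like the following rough sketch (all receptor activities and effect weights here are placeholders I made up, NOT clinically accurate values):

```python
# Rough sketch of a table-driven model. All numbers are placeholders.
RECEPTOR_PROFILES = {
    # drug: relative activity at each receptor (+ agonist, - antagonist)
    "norepinephrine": {"alpha1": +1.0, "alpha2": +0.5, "beta1": +0.5, "beta2": 0.0},
    "phentolamine":   {"alpha1": -1.0, "alpha2": -0.5, "beta1": 0.0,  "beta2": 0.0},
}

# How stimulating each receptor shifts each CV metric (placeholder weights).
RECEPTOR_EFFECTS = {
    "alpha1": {"SVR": +20, "Psys": +15, "Pdia": +10, "HR": 0},
    "alpha2": {"SVR": +5,  "Psys": +3,  "Pdia": +2,  "HR": 0},
    "beta1":  {"SVR": 0,   "Psys": +5,  "Pdia": 0,   "HR": +10},
    "beta2":  {"SVR": -10, "Psys": -3,  "Pdia": -5,  "HR": +5},
}

def apply_drug(state, drug, blocked):
    """Apply one drug's effects; an antagonist marks receptors as blocked."""
    for receptor, activity in RECEPTOR_PROFILES[drug].items():
        if activity > 0 and receptor in blocked:
            continue  # agonist effect is prevented at a blocked receptor
        if activity < 0:
            blocked.add(receptor)  # antagonist blocks this receptor for later drugs
        for metric, delta in RECEPTOR_EFFECTS[receptor].items():
            state[metric] += activity * delta
    return state

state = {"Psys": 120, "Pdia": 80, "HR": 70, "SVR": 1000}
blocked = set()
for drug in ["norepinephrine", "phentolamine"]:  # order of administration matters
    state = apply_drug(state, drug, blocked)
    print(drug, state)
```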
Thank you for your time.

An entity with different attributes depending on its type?

I'm trying to design an ER diagram from a given brief for a university project.
I'm confused about how I should handle this problem:
The items sold in the Food Truck can be of different types: burritos and beverages. Every item has an ID, a description, and a price. Assume that every Food Truck has infinite stock of each item (i.e. we do not need to track stock levels in each Food Truck).
All Burritos come with rice, a type of bean, a filling, and a set of optional toppings. Burritos are priced by size (Mini, Regular, and Grande). Bean types will vary. Chipp will start by offering two types: black beans and red beans. Burrito fillings will vary (depending on the season). There are at least 3 types of Burrito fillings and there should be a vegetarian option.
A Burrito may optionally have toppings: lettuce, tomato, and mild and hot salsa. Toppings are free, but Chipp will also offer guacamole as a topping for which there is an extra charge.
The Food Truck also sells different types of refreshing beverages, both alcoholic and non-alcoholic. All beverages have a size measured in milliliters (just in case Chipp takes his Food Truck business over the Channel to mainland Europe).
The solution I have arrived at so far is to make two weak entities, each with its own relationship, like this:
Is this the correct way to handle the problem?
Chen's original notation had no symbols for subtyping. A weak entity set without a weak key produced the same result. Your approach is correct within that framework. However, in the same original notation, weak entity sets were associated with identifying relationships (double-bordered diamond) and total participation was indicated with a double line between the entity set and relationship, rather than the (min,max) style of cardinality indicator. This isn't a recommendation to stick to the original notation, but it may be a good idea to verify your answer against your textbook on these points.
A number of different extension notations have been developed to represent subtyping, and to indicate disjointness, which the original notation couldn't. If any of these are covered in your curriculum, I suggest you use them as they're more expressive.
Note also the extra charge requirement on guacamole, which your diagram doesn't include yet. Finally, you indicated price as a derived attribute of item, but I don't see any other attributes it could be calculated from.
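If it helps to see the supertype/subtype structure outside diagram notation, here is a rough sketch as plain Python classes (illustrative only; the attribute names are loose readings of the brief, and the prices are placeholders). Note that at the Burrito level, a price can be derived from size plus the guacamole surcharge, even though it cannot be derived at the Item level:

```python
from dataclasses import dataclass, field

@dataclass
class Item:
    """Supertype: attributes shared by every item."""
    item_id: int
    description: str

@dataclass
class Burrito(Item):
    """Subtype: burrito-specific attributes; price derives from size."""
    size: str = "Regular"           # Mini, Regular, or Grande
    bean_type: str = "black beans"  # black beans or red beans
    filling: str = ""
    toppings: list = field(default_factory=list)

    def price(self):
        base = {"Mini": 5.0, "Regular": 7.0, "Grande": 9.0}[self.size]  # placeholder prices
        return base + (1.5 if "guacamole" in self.toppings else 0.0)    # extra charge

@dataclass
class Beverage(Item):
    """Subtype: beverage-specific attributes."""
    size_ml: int = 330
    alcoholic: bool = False
```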

Methods for automated synonym detection

I am currently working on a neural-network-based approach to short-document classification, and since the documents I am working with are usually around ten words, the standard statistical document-classification methods are of limited use. Because of this, I am attempting to implement some form of automated synonym detection for the matches provided in training. More specifically, my question is about resolving a situation like the following:
Say I have a classification of "Involving Food" and one of "Involving Spheres", and a data set as follows:
"Eating Apples"(Food);"Eating Marbles"(Spheres); "Eating Oranges"(Food, Spheres);
"Throwing Baseballs(Spheres)";"Throwing Apples(Food)";"Throwing Balls(Spheres)";
"Spinning Apples"(Food);"Spinning Baseballs";
I am looking for an incremental method that would move towards the following linkages:
Eating --> Food
Apples --> Food
Marbles --> Spheres
Oranges --> Food, Spheres
Throwing --> Spheres
Baseballs --> Spheres
Balls --> Spheres
Spinning --> Neutral
Involving --> Neutral
I do realize that in this specific case these might be slightly suspect matches, but they illustrate the problem I am having. My general thought was to increment a word's score for a category whenever it appears opposite words in that category, but then I would end up incidentally linking everything to the word "Involving". I then thought I would simply decrement a word for appearing in conjunction with multiple synonyms, or with non-synonyms, but then I would lose the link between "Eating" and "Food". Does anyone have any idea how I could put together an algorithm that moves in the directions indicated above?
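For concreteness, here is a minimal sketch of the kind of counting scheme I mean (pure Python; the data is the toy set from above):

```python
from collections import defaultdict

# (document, labels) pairs from the example above.
data = [
    ("Eating Apples", {"Food"}),
    ("Eating Marbles", {"Spheres"}),
    ("Eating Oranges", {"Food", "Spheres"}),
    ("Throwing Baseballs", {"Spheres"}),
    ("Throwing Apples", {"Food"}),
    ("Throwing Balls", {"Spheres"}),
    ("Spinning Apples", {"Food"}),
]

# Increment a word's score for every category its document is labeled with.
scores = defaultdict(lambda: defaultdict(int))
for text, labels in data:
    for word in text.split():
        for label in labels:
            scores[word][label] += 1

for word, by_label in scores.items():
    print(word, dict(by_label))
# The problem: a word like "Involving", if it appeared in every document,
# would accumulate counts for every category and swamp the real signal.
```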
There is an unsupervised bootstrapping approach for this that was once explained to me.
There are different ways of applying this approach, and many variants, but here's a simplified version.
Concept:
Start by assuming that if two words are synonyms, then in your corpus they will appear in similar settings (eating grapes, eating a sandwich, etc.).
(In this variant I will use co-occurrence as the setting.)
Bootstrapping Algorithm:
We have two lists:
one list will contain the words that co-occur with food items;
one list will contain the words that are food items.
Supervised Part
Start by seeding one of the lists; for instance, I might write the word Apple on the food-items list.
Now let the computer take over.
Unsupervised Part
It will first find all words in the corpus that appear just before Apple, and sort them in order of most occurring.
Take the top two (or however many you want) and add them to the co-occurs-with-food-items list. For example, perhaps "eating" and "delicious" are the top two.
Now use that list to find the next two top food words by ranking the words that appear to the right of each word in the list.
Continue this process, expanding each list, until you are happy with the results. (As you go, you may need to manually remove entries from the lists that are clearly wrong.)
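Here is a minimal sketch of that loop in Python, assuming a plain tokenized corpus (the tiny corpus, seed word, and cutoff of two are illustrative only):

```python
from collections import Counter

corpus = ("i was eating apple pie . she kept eating grapes . "
          "delicious apple trees . delicious grapes grew here").split()

food_words = {"apple"}   # supervised seed
context_words = set()    # words that co-occur with (precede) food words

for _ in range(3):       # a few bootstrapping rounds
    # 1) Rank words that appear just before a known food word.
    before = Counter(corpus[i - 1] for i in range(1, len(corpus))
                     if corpus[i] in food_words)
    context_words |= {w for w, _ in before.most_common(2)}

    # 2) Rank words that appear just after a known context word.
    after = Counter(corpus[i + 1] for i in range(len(corpus) - 1)
                    if corpus[i] in context_words)
    food_words |= {w for w, _ in after.most_common(2)}

print("context words:", context_words)
print("food words:", food_words)
```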
Variants
This procedure can be made quite effective if you take into account the grammatical setting of the keywords, using patterns such as:
Subj ate NounPhrase
NounPhrase is/are Moldy
For example, the sentence "The workers harvested the Apples" matches the pattern
subj verb Apples
which might imply that harvested is an important verb for distinguishing foods. You would then look for other occurrences of subj harvested NounPhrase.
You can also expand this process to move words into several categories at each step, instead of a single category.
My Source
This approach was used in a system developed at the University of Utah a few years back, which succeeded in compiling a decent list of weapon words, victim words, and place words just by looking at news articles.
It was an interesting approach and had good results.
It is not a neural network approach, but it is an intriguing methodology.
Edit:
The system at the University of Utah was called AutoSlog-TS; a short slide about it can be seen here, towards the end of the presentation, and a link to a paper about it is here.
You could try LDA, which is unsupervised. There is a supervised version of LDA, but I can't remember the name! The Stanford parser will have the algorithm, which you can play around with. I understand it's not the NN approach you are looking for, but if you are just looking to group information together, LDA would seem appropriate, especially if you are looking for 'topics'.
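For instance, a tiny LDA experiment with gensim (one library among several that implement it; the toy corpus is the data from the question):

```python
from gensim import corpora, models

docs = [
    "eating apples", "eating marbles", "eating oranges",
    "throwing baseballs", "throwing apples", "throwing balls",
    "spinning apples", "spinning baseballs",
]
texts = [d.split() for d in docs]

# Map words to integer ids and convert each document to bag-of-words counts.
dictionary = corpora.Dictionary(texts)
bow = [dictionary.doc2bow(t) for t in texts]

# Two topics, hoping they roughly separate "food" from "spheres".
lda = models.LdaModel(bow, num_topics=2, id2word=dictionary,
                      passes=50, random_state=0)
for topic_id in range(2):
    print(lda.print_topic(topic_id))
```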
The code here (http://ronan.collobert.com/senna/) implements a neural network to perform a variety of NLP tasks. The page also links to a paper that describes one of the most successful approaches so far of applying convolutional neural nets to NLP tasks.
It is possible to modify their code to use the trained networks that they provide to classify sentences, but this may take more work than you were hoping for, and it can be tricky to correctly train neural networks.
I had a lot of success using a similar technique to classify biological sequences, but, in contrast to English language sentences, my sequences had only 20 possible symbols per position rather than 50-100k.
One interesting feature of their network that may be useful to you is their word embeddings. Word embeddings map individual words (each can be considered an indicator vector of length 100k) to real valued vectors of length 50. Euclidean distance between the embedded vectors should reflect semantic distance between words, so this could help you detect synonyms.
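As a sketch of how that could surface synonym candidates (the vectors below are random stand-ins; in practice you would load the trained embeddings that they provide):

```python
import numpy as np

rng = np.random.default_rng(0)
# Random stand-ins for trained 50-dimensional word embeddings.
embeddings = {w: rng.normal(size=50) for w in
              ["eat", "consume", "throw", "sphere", "ball"]}

def nearest(word, k=2):
    """Return the k words whose vectors are closest in Euclidean distance."""
    target = embeddings[word]
    others = [(np.linalg.norm(target - v), w)
              for w, v in embeddings.items() if w != word]
    return sorted(others)[:k]

print(nearest("eat"))  # with real embeddings, expect words like "consume"
```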
For a simpler approach WordNet (http://wordnet.princeton.edu/) provides lists of synonyms, but I have never used this myself.
I'm not sure if I misunderstand your question. Do you require the system to be able to reason based on your input data alone, or would it be acceptable to refer to an external dictionary?
If it is acceptable, I would recommend you take a look at http://wordnet.princeton.edu/, which is a database of English word relationships. (It also exists for a few other languages.) These relationships include synonyms, antonyms, hyperonyms (which is what you really seem to be looking for, rather than synonyms), hyponyms, etc.
The hyperonym/hyponym relationship links more generic terms to more specific ones. The words "banana" and "orange" are hyponyms of "fruit"; it is a hyperonym of both (http://en.wikipedia.org/wiki/Hyponymy). Of course, "orange" is ambiguous, and is also a hyponym of "color".
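For example, these relationships can be queried through NLTK's WordNet interface (assuming nltk and its wordnet corpus are installed):

```python
import nltk
nltk.download("wordnet", quiet=True)  # one-time corpus download
from nltk.corpus import wordnet as wn

# List every sense of "orange" along with its more generic parent terms.
for synset in wn.synsets("orange"):
    hypernyms = [h.name() for h in synset.hypernyms()]
    print(synset.name(), "->", hypernyms)
# Among the results you should see senses linked to both citrus/fruit
# and chromatic_color, reflecting the ambiguity described above.
```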
You asked for a method, but I can only point you to data. Even if this turns out to be useful, you will obviously need quite a bit of work to use it for your particular application. For one thing, how do you know when you have reached a suitable level of abstraction? Unless your input is heavily normalized, you will have a mix of generic and specific terms. Do you stop at "citrus", "fruit", "plant", "animate", "concrete", or "noun"? (Sorry, just made up this particular hierarchy.) Still, hope this helps.

Area of overlap of two circular Gaussian functions

I am trying to write code that finds the overlap between 3D shapes.
Each shape is defined by two intersecting normal distributions (one in the x direction, one in the y direction).
Do you have any suggestions of existing code that addresses this question or functions that I can utilize to build this code? Most of my programming experience has been in R, but I am open to solutions in other languages as well.
Thank you in advance for any suggestions and assistance!
The longer research context on this question: I am studying the use of acoustic space by insects. I want to know whether randomly assembled groups of insects would have calls that are more or less similar than we observe in natural communities (a randomization test). To do so, I need to randomly select insect species and calculate the similarity between their calls.
For each species, I have a mean and variance for two call characteristics that are approximately normally distributed. I would like to use these two call characteristics to build a 3D probability distribution for the species. I would then like to calculate the amount by which the PDF for one species overlaps with another.
Please accept my apologies if the question is not clear or appropriate for this forum.
I work in small molecule drug discovery, and I frequently use a program (ROCS, by OpenEye Scientific Software) based on algorithms that represent molecules as collections of spherical Gaussian functions and compute intersection volumes. You might look at the following references, as well as the ROCS documentation:
(1) Grant and Pickup, J. Phys. Chem. 1995, 99, 3503-3510
(2) Grant, Gallardo, and Pickup, J. Comp. Chem. 1996, 17, 1653-1666
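If a quick numerical estimate is enough: one common measure is the overlapping coefficient, the integral of min(f, g) over the plane for the two bivariate normal densities f and g, which can be approximated on a grid. A minimal sketch with SciPy (the means and variances are invented; the same approach carries over to R):

```python
import numpy as np
from scipy.stats import multivariate_normal

# Invented call parameters: mean and variance for two call characteristics,
# assumed independent (diagonal covariance), one distribution per species.
species_a = multivariate_normal(mean=[3.0, 5.0], cov=[[1.0, 0.0], [0.0, 2.0]])
species_b = multivariate_normal(mean=[4.0, 6.0], cov=[[1.5, 0.0], [0.0, 1.0]])

# Evaluate both PDFs on a grid and integrate min(f, g) numerically.
step = 0.05
x, y = np.mgrid[-5:15:step, -5:15:step]
grid = np.dstack((x, y))
f, g = species_a.pdf(grid), species_b.pdf(grid)
overlap = np.minimum(f, g).sum() * step * step  # sum times cell area

print(f"overlapping coefficient ~ {overlap:.3f}")  # 1.0 = identical PDFs
```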