Training conversations using sequence models - deep-learning

I have a question regarding training conversations. The context is that the next statement is not necessarily a function of just the previous statement; it can depend on any earlier statement in the conversation. For example:
Person 1: What is your favorite food and restaurant?
Person 2: My favorite food is burgers, and McDonald's is my favorite restaurant.
Person 1: Why do you like burgers?
Person 2: Because I don't care about the health aspect while eating.
Person 1: Why do you like McDonald's when there are so many places where you can buy a burger?
As we can see, the last question was derived from an answer received three turns earlier.
In this context, how do I train an LSTM so that it remembers all the previous context?
Essentially, I am looking for an approach to creating my training data and output sentences.

I'm not sure that an LSTM will give you sufficient long-term memory for the example conversation you posted. You will likely need some sort of transformer or memory network to maintain context. Take a look at approaches to the "persona chat" problem, as well as this recent paper on handling conversational context.
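Whichever architecture you end up with, the training-data question has a common answer: build (context, response) pairs where the context is the concatenation of all preceding turns, not just the last one. A minimal sketch (the `<eou>` separator token is an assumption for illustration, not from any particular library):

```python
# Minimal sketch: build (context, response) training pairs by
# concatenating every preceding turn, so the model can see answers
# given several steps earlier.
conversation = [
    "what is your favorite food and restaurant",
    "my favorite food is burgers and McDonald's is my favorite restaurant",
    "why do you like burgers",
    "because I don't care about the health aspect while eating",
    "why do you like McDonald's when there are so many burger places",
]

SEP = " <eou> "  # hypothetical end-of-utterance separator token

pairs = []
for i in range(1, len(conversation)):
    context = SEP.join(conversation[:i])  # full history, not just the last turn
    pairs.append((context, conversation[i]))

# pairs[3][0] still contains turn 2 ("...McDonald's is my favorite
# restaurant"), so a model trained on these pairs can, in principle,
# learn that the final question refers to an answer three turns back.
```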


Variable actions in Deep Reinforcement Learning

I'm trying to teach an AI the combat mechanics of a system similar to Darkest Dungeon.
The goal is for the AI to act well while controlling NPCs with random stats and random skills. This means that in each session the AI's character will have different values for health, stress, accuracy, dodge, etc. The stats of each skill the character has are also randomized: the damage, accuracy, and effects.
In my current system, the inputs to the model are:
All the stats of the AI's character.
All the stats of its allies.
A subset of the stats of each enemy (in Darkest Dungeon you cannot see all of an enemy's stats).
All the stats of each skill.
The outputs of the model are:
Which skill to use (out of 4 options).
Which target (8 options in total: any ally, including self, or any enemy).
I'm using an action mask to disable invalid actions (such as using an offensive skill on an ally, or targeting a position with no character in it).
The main problem I'm having is that what each action does changes heavily depending on the stats of the skill in that slot.
Does anyone have insight into which kind of learning I'm looking for? So far I have tried MA-POCA, provided by the Unity ML-Agents package, with no success; the model didn't seem to learn that what each action does depends strongly on the associated skill's stats.
Searching for papers on the subject only turned up articles about action spaces of variable size, which I already handle by masking invalid actions.
Note: I'm not limited to training in the Unity environment. The only limitation I have is that the model must be convertible/exportable to ONNX format.
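One thing worth checking is whether each skill's stats sit at a fixed offset in the observation, aligned with its action index; if the policy cannot tell which stats belong to which action slot, the actions will look arbitrary to it. A rough sketch of this alignment (all names and sizes below are made up for illustration, not MA-POCA specifics):

```python
import numpy as np

# Hypothetical sizes: 4 skills, 6 features per skill, 8 targets.
N_SKILLS, SKILL_FEATS = 4, 6
N_TARGETS = 8

def build_observation(char_stats, ally_stats, enemy_stats, skill_stats):
    # skill_stats has shape (N_SKILLS, SKILL_FEATS); flattening keeps
    # slot i's stats at a fixed position, aligned with action index i,
    # so the policy can condition on what the skill in slot i does.
    return np.concatenate([
        char_stats,              # own stats
        ally_stats.ravel(),      # all allies' stats
        enemy_stats.ravel(),     # the visible subset of enemy stats
        skill_stats.ravel(),     # per-slot skill stats
    ]).astype(np.float32)

def action_mask(valid_skills, valid_targets):
    # flat mask over the joint (skill, target) action space
    return np.outer(valid_skills, valid_targets).ravel().astype(bool)

obs = build_observation(np.zeros(10), np.zeros((3, 10)),
                        np.zeros((4, 5)), np.zeros((N_SKILLS, SKILL_FEATS)))
mask = action_mask(np.ones(N_SKILLS), np.ones(N_TARGETS))
```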

Is the reward related to previous state or next state?

In the reinforcement learning framework, I am a little bit confused about the reward and how it is related to states. For example, in Q-learning, we have the following formula for updating the Q table:

$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma \max_a Q(s_{t+1}, a) - Q(s_t, a_t) \right]$$

This means that the reward is obtained from the environment at time $t+1$: after applying the action $a_t$, the environment gives $s_{t+1}$ and $r_{t+1}$.
However, it is also common to associate the reward with the previous time step, that is, to use $r_t$ in the above formula; see, for example, the Wikipedia page for Q-learning (https://en.wikipedia.org/wiki/Q-learning). Why is this?
Incidentally, some Wikipedia pages about the same topic, but in different languages, use $r_{t+1}$ (or, unexpectedly, $R_{t+1}$). See, for example, the Italian and Japanese pages:
https://it.wikipedia.org/wiki/Q-learning
https://ja.wikipedia.org/wiki/Q%E5%AD%A6%E7%BF%92
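For reference, both conventions describe the same update; they differ only in whether the reward emitted by the transition $(s_t, a_t) \to s_{t+1}$ is indexed by the step on which it arrives or by the step that caused it:

```latex
% Sutton & Barto's convention: acting in s_t yields r_{t+1} and s_{t+1}
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
    + \alpha \left[ r_{t+1} + \gamma \max_a Q(s_{t+1}, a) - Q(s_t, a_t) \right]

% The alternative convention: the same reward is written r_t, read as
% "the reward for taking a_t in s_t"
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
    + \alpha \left[ r_t + \gamma \max_a Q(s_{t+1}, a) - Q(s_t, a_t) \right]
```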

Building an autonomic drugs widget for medical education

I've made my way over to this community because I'm planning to build a widget to help medical students understand the effects of various autonomic medications on cardiovascular metrics like heart rate (HR), BP (systolic, diastolic, and mean), and peripheral resistance (SVR). Some background: I'm a 3rd-year med student in the US without fluency in any programming language (which makes this particularly difficult), but I am willing to spend the time to pick up what I need to know to make this happen.
Regarding the project:
The effects of autonomic medications like epinephrine, norepinephrine, beta-blockers, and alpha-blockers on the cardiovascular system are of great interest to physicians because these drugs can be used to resuscitate, to prep for surgery, to slow the progression of cardiovascular and respiratory disease, and even as antidotes for certain toxicities. There are four receptor types we are primarily concerned with: alpha1, alpha2, beta1, and beta2. The receptor selectivity profile of any given drug is what governs its effects on the CV system. The way these effects are taught and tested in med school classrooms and on the United States board exams is in the form of graphs.
The impetus for this project is that many of my classmates and I struggled with this concept when we were initially learning it, and I believe a large part of that arises from the lack of a resource that shows the changes in the graphs from baseline, in real time.
When being taught this material, we are required to consider: a) the downstream effects when the receptor types listed above are stimulated (by an agonist drug) or inhibited (by an antagonist); b) the receptor specificities of each of the autonomic drugs of interest (there are about 8 that are very important); and c) how to interpret the graphs shown above and how they would change if multiple autonomic drugs were administered in succession. (Exams and the boards love to show graphs with various points marked along them, then ask which drugs are responsible for the changes seen, just like the example above.)
The current methods of learning these three points are a mess, and having gone through it, I'd like to do what I can to contribute to building a more effective resource.
My goal is to create a widget that allows a user to visualize these changes with up to 3 drugs in succession. Here is a rough sketch of the goal.
In this example, norepinephrine has strong alpha1 agonist effects, which cause an increase in systolic (blue line), diastolic (red line), and mean BP, as well as in peripheral resistance. Due to the increased BP, there is a reflexive decrease in HR.
Upon the administration of phentolamine, a strong alpha1 antagonist, the BP and SVR decline while HR increases reflexively.
Regarding the widget, I would like the user to be able to choose up to 3 drugs from a drop-down menu (e.g., Drug 1, Drug 2, Drug 3), and the graphs to reflect the effects of those drugs on the CV metrics while ALSO taking into account the interactions of the drugs with one another.
This is an IMPORTANT point: the order in which drugs are added matters, because certain receptors become blocked, preventing later drugs from having their primary effect, so they revert to their secondary effect.
If you're still following me on this, what I'm looking for is some help in figuring out how best to approach all the possibilities that can happen. Should I try to learn if-then statements and write a script to produce graphs based on those (e.g., if epi, then Psys = x, Pdia = y, MAP = z)? Should I create a contingency table in Excel in which I list the 8 drugs I'm focusing on, assign values for the metrics, and then plot those, essentially accounting for all the permutations? Any thoughts and direction would be greatly appreciated.
Thank you for your time.
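For what it's worth, the if-then route can stay quite small if each drug is reduced to a receptor-activity profile and the simulation walks the drugs in order, remembering which receptors have been blocked. Below is a very rough, medically unvalidated sketch: every number, receptor profile, and the reflex rule are placeholders for illustration, not real pharmacology.

```python
# Per-unit agonist effect of each receptor on each metric (placeholders).
RECEPTOR_EFFECTS = {
    "alpha1": {"sys": +25, "dia": +20, "svr": +300, "hr": 0},
    "beta1":  {"sys": +10, "dia": -5,  "svr": 0,    "hr": +20},
    "beta2":  {"sys": 0,   "dia": -10, "svr": -200, "hr": +5},
}

# Receptor -> activity in [-1, +1]; positive = agonist, negative = antagonist.
DRUGS = {
    "norepinephrine": {"alpha1": +1.0, "beta1": +0.5},
    "phentolamine":   {"alpha1": -1.0},
}

def apply_sequence(drug_names, baseline):
    state, blocked = dict(baseline), set()
    history = [dict(state)]
    for name in drug_names:
        prev_mean = (state["sys"] + 2 * state["dia"]) / 3
        for receptor, activity in DRUGS[name].items():
            if receptor in blocked and activity > 0:
                continue          # agonist's primary effect prevented by a blocker
            for metric, delta in RECEPTOR_EFFECTS[receptor].items():
                state[metric] += activity * delta   # antagonists push the other way
            if activity < 0:
                blocked.add(receptor)   # later drugs lose this receptor
        # crude baroreceptor reflex: HR moves opposite to mean pressure
        new_mean = (state["sys"] + 2 * state["dia"]) / 3
        state["hr"] -= 0.5 * (new_mean - prev_mean)
        history.append(dict(state))
    return history                # one snapshot per administration, ready to plot

baseline = {"sys": 120, "dia": 80, "svr": 1100, "hr": 70}
print(apply_sequence(["norepinephrine", "phentolamine"], baseline))
```

The same table-driven structure also covers the Excel idea: the two dictionaries are the contingency table, and the loop replaces enumerating every permutation by hand.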

Is It Efficient and Scalable for a Neural Network to Rely on Weights that Require Database Interaction?

I'm a high school senior interested in computer science, and I have been programming for almost nine years now. I've recently become interested in machine learning, and I have decided to implement a neural network. I haven't begun to code it yet and have been in the design stage for a while now. The objective of the program is to analyze a student's paper, along with some other information, and then predict what grade the student will receive, much like PaperRater. However, I plan to make it far more personal than PaperRater.
The program has four inputs: one is the student's paper, the second is the student's id (i.e., a primary key), the third is the teacher's id, and the fourth is the course id. I am implementing this on a website where only registered, verified users can submit their papers for grading. The contents of the paper are going to be weighted in relation to the relationship between the teacher and student and in relation to the course difficulty. The network adapts to the teacher's grading habits for certain classes, the relationship between the teacher and student (e.g., if a teacher dislikes a student you might expect to see a drop in the student's grades), and the course level (e.g., a teacher shouldn't grade a freshman's paper as harshly as a senior's).
However, this approach poses some considerable problems. There is an inherent limit where the numbers of students, teachers, and courses prove to be too much and everything blows up! That's because there is no magic set of numbers that can account for every combination of student, teacher, and course.
So, I've concluded that each teacher, student, and course must have an individual (albeit initially arbitrary) weight associated with it, not present in the neural network itself. The teacher's weight would describe her grading difficulty, and the student's weight would describe her ability as a writer. The weight of the course would describe the difficulty of the course. Of course, as more and more data is aggregated, the weights should adapt to become more accurate representations.
I realize that there are relations between teachers and students, teachers and courses, and students and courses; therefore, I plan to make three respective hidden layers, each of which sums the weights of its inputs and applies an activation function. How could I store the weights associated with each teacher, student, and course, though?
I have considered storing them in their respective tables, but I don't know how well that would scale (or, for that matter, whether it would work at all). I also considered storing them in a file and loading them from there, but I'm sure that would be even worse than a database.
So the main question I have is: is it efficient, in terms of space and computational complexity, and scalable, to store and manage separate, individual weights for each possible value of certain inputs in a SQL database outside of the neural network, given that there is a finite (though not necessarily small) number of possible values for those inputs, and still receive a reasonable output?
Regardless, I would like an explanation of why or why not. I believe it would be just fine, but I can't justify it myself, so I'm asking for help. Thanks in advance!
(P.S.: If you notice any problems with my approach not covered by the scope of this question, or have general advice, please include it as an addendum to your answer or message me.)
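What you are describing is essentially an embedding table: one small learned vector per student, teacher, and course, stored outside the network and looked up by id. A minimal sketch of how that could live in a SQL database (table and column names are made up for illustration):

```python
# Each entity's weight vector lives in a SQLite table keyed by
# (type, id) and is fetched only for the rows a prediction involves.
import json
import random
import sqlite3

conn = sqlite3.connect("entity_weights.db")
conn.execute("""CREATE TABLE IF NOT EXISTS entity_weights (
    entity_type TEXT,
    entity_id   INTEGER,
    weights     TEXT,
    PRIMARY KEY (entity_type, entity_id))""")

def get_weights(entity_type, entity_id, dim=8):
    row = conn.execute(
        "SELECT weights FROM entity_weights "
        "WHERE entity_type = ? AND entity_id = ?",
        (entity_type, entity_id)).fetchone()
    if row:
        return json.loads(row[0])
    w = [random.gauss(0.0, 0.1) for _ in range(dim)]  # fresh initialization
    conn.execute("INSERT INTO entity_weights VALUES (?, ?, ?)",
                 (entity_type, entity_id, json.dumps(w)))
    conn.commit()
    return w

def save_weights(entity_type, entity_id, w):
    conn.execute("UPDATE entity_weights SET weights = ? "
                 "WHERE entity_type = ? AND entity_id = ?",
                 (json.dumps(w), entity_type, entity_id))
    conn.commit()

# Per prediction: look up three small vectors, feed them to the network
# alongside the paper's features, and write back any gradient updates.
student_w = get_weights("student", 42)
teacher_w = get_weights("teacher", 7)
course_w  = get_weights("course", 101)
```

Since each prediction touches only three indexed rows, the lookup cost stays roughly constant no matter how many students, teachers, or courses exist, which is the scalability property you are after.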

Methods for automated synonym detection

I am currently working on a neural-network-based approach to short document classification, and since the texts I am working with are usually around ten words long, the standard statistical document classification methods are of limited use. Because of this, I am attempting to implement some form of automated synonym detection for the matches provided in training. More specifically, my question is about resolving a situation like the following:
Say I have a classification of "Involving Food" and one of "Involving Spheres", and a data set as follows:
"Eating Apples"(Food);"Eating Marbles"(Spheres); "Eating Oranges"(Food, Spheres);
"Throwing Baseballs(Spheres)";"Throwing Apples(Food)";"Throwing Balls(Spheres)";
"Spinning Apples"(Food);"Spinning Baseballs";
I am looking for an incremental method that would move towards the following linkages:
Eating --> Food
Apples --> Food
Marbles --> Spheres
Oranges --> Food, Spheres
Throwing --> Spheres
Baseballs --> Spheres
Balls --> Spheres
Spinning --> Neutral
Involving --> Neutral
I do realize that in this specific case these might be slightly suspect matches, but they illustrate the problem I am having. My general thought was to increment a word's score for appearing opposite the words in a category, but then I would end up incidentally linking everything to the word "Involving". I then thought that I would simply decrement a word's score for appearing in conjunction with multiple synonyms, or with non-synonyms, but then I would lose the link between "Eating" and "Food". Does anyone have any clue as to how I could put together an algorithm that would move me in the directions indicated above?
There is an unsupervised bootstrapping approach for this that was once explained to me.
There are different ways of applying this approach, and variants, but here's a simplified version.
Concept:
Start by assuming that if two words are synonyms, then in your corpus they will appear in similar settings (eating grapes, eating a sandwich, etc.).
(In this variant I will use co-occurrence as the setting.)
Bootstrapping Algorithm:
We have two lists:
one list will contain the words that co-occur with food items;
the other list will contain the words that are food items.
Supervised Part
Start by seeding one of the lists; for instance, I might write the word Apple on the food-items list.
Now let the computer take over.
Unsupervised Part
It will first find all words in the corpus that appear just before Apple, and sort them by frequency.
Take the top two (or however many you want) and add them to the co-occurs-with-food-items list. For example, perhaps "eating" and "delicious" are the top two.
Now use that list to find the next two top food words by ranking the words that appear to the right of each word in the list.
Continue this process, expanding each list, until you are happy with the results.
Once that's done, you have your lists.
(You may need to manually remove some entries from the lists as you go that are clearly wrong.)
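A minimal sketch of that loop, using plain co-occurrence counts (the toy corpus, the seed word, and the two-per-round setting are assumptions for illustration):

```python
from collections import Counter

def bootstrap(corpus, seed_items, rounds=5, top_k=2):
    items = set(seed_items)   # known food items, e.g. {"apples"}
    contexts = set()          # words that co-occur with food items
    for _ in range(rounds):
        # rank words that appear immediately before a known item
        left = Counter(prev for sent in corpus
                       for prev, word in zip(sent, sent[1:])
                       if word in items and prev not in contexts)
        contexts |= {w for w, _ in left.most_common(top_k)}
        # rank words that appear immediately after a known context word
        right = Counter(word for sent in corpus
                        for prev, word in zip(sent, sent[1:])
                        if prev in contexts and word not in items)
        items |= {w for w, _ in right.most_common(top_k)}
    return items, contexts

corpus = [["eating", "apples"], ["eating", "marbles"],
          ["delicious", "apples"], ["throwing", "baseballs"]]
foods, food_contexts = bootstrap(corpus, {"apples"})
print(foods, food_contexts)
```

Note how "marbles" sneaks into the food list in this toy run; that is exactly the drift the manual-removal step above is meant to catch.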
Variants
This procedure can be made quite effective if you take into account the grammatical setting of the keywords.
Subj ate NounPhrase
NounPhrase are/is Moldy
The workers harvested the Apples.
subj verb Apples
That might imply that harvested is an important verb for distinguishing foods.
Then look for other occurrences of the pattern subj harvested NounPhrase.
You can also expand this process to move words into multiple categories, instead of a single category, at each step.
My Source
This approach was used in a system developed at the University of Utah a few years back, which was successful at compiling a decent list of weapon words, victim words, and place words just by looking at news articles.
It's not a neural network approach, but it's an intriguing methodology, and it had good results.
Edit:
The system at the University of Utah was called AutoSlog-TS; a short slide about it can be seen here, towards the end of the presentation, and a paper about it is linked here.
You could try LDA, which is unsupervised. (There is a supervised version of LDA, but I can't remember its name!) The Stanford parser will have the algorithm, which you can play around with. I understand it's not the NN approach you are looking for, but if you are just looking to group information together, LDA would seem appropriate, especially if you are looking for 'topics'.
The code here (http://ronan.collobert.com/senna/) implements a neural network to perform a variety of NLP tasks. The page also links to a paper that describes one of the most successful approaches so far to applying convolutional neural nets to NLP tasks.
It is possible to modify their code to use the trained networks that they provide to classify sentences, but this may take more work than you were hoping for, and it can be tricky to train neural networks correctly.
I had a lot of success using a similar technique to classify biological sequences, but, in contrast to English-language sentences, my sequences had only 20 possible symbols per position rather than 50-100k.
One interesting feature of their network that may be useful to you is its word embeddings. Word embeddings map individual words (each of which can be considered an indicator vector of length ~100k) to real-valued vectors of length 50. Euclidean distance between the embedded vectors should reflect semantic distance between words, so this could help you detect synonyms.
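As a concrete illustration of that idea (the vectors below are random stand-ins, so the actual output is meaningless; real embeddings would come from a trained model such as SENNA's):

```python
import numpy as np

vocab = ["apples", "oranges", "marbles", "baseballs"]
emb = {w: np.random.randn(50) for w in vocab}  # placeholder 50-d embeddings

def nearest(word, k=2):
    # small Euclidean distance in embedding space suggests related
    # meaning, which is a cheap synonym signal
    dists = {v: np.linalg.norm(emb[word] - emb[v])
             for v in vocab if v != word}
    return sorted(dists, key=dists.get)[:k]

print(nearest("apples"))
```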
For a simpler approach, WordNet (http://wordnet.princeton.edu/) provides lists of synonyms, though I have never used it myself.
I'm not sure if I've misunderstood your question. Do you require the system to be able to reason based on your input data alone, or would it be acceptable to refer to an external dictionary?
If it is acceptable, I would recommend you take a look at http://wordnet.princeton.edu/, which is a database of English word relationships. (It also exists for a few other languages.) These relationships include synonyms, antonyms, hyperonyms (which are what you really seem to be looking for, rather than synonyms), hyponyms, etc.
The hyperonym/hyponym relationship links more generic terms to more specific ones. The words "banana" and "orange" are hyponyms of "fruit"; it is a hyperonym of both (http://en.wikipedia.org/wiki/Hyponymy). Of course, "orange" is ambiguous, and is also a hyponym of "color".
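If you go this route, WordNet is easy to query programmatically, for example through NLTK (this assumes nltk is installed and the corpus has been fetched once with nltk.download("wordnet")):

```python
from nltk.corpus import wordnet as wn

# Hypernyms of each sense of "orange"; note the ambiguity: some senses
# generalize towards fruit/citrus, another towards color.
for synset in wn.synsets("orange"):
    print(synset.name(), "->", [h.name() for h in synset.hypernyms()])

# Synonyms ("lemmas") within the first sense of "banana".
print([lemma.name() for lemma in wn.synsets("banana")[0].lemmas()])
```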
You asked for a method, but I can only point you to data. Even if this turns out to be useful, you will obviously need quite a bit of work to use it for your particular application. For one thing, how do you know when you have reached a suitable level of abstraction? Unless your input is heavily normalized, you will have a mix of generic and specific terms. Do you stop at "citrus", "fruit", "plant", "animate", "concrete", or "noun"? (Sorry, I just made up this particular hierarchy.) Still, I hope this helps.