In Caffe2, is there any pretrained model that deals with chatbot conversation?
Is there any model that helps with text classification / analysis?
Is there any DNN example / sample that deals with text statements, as described in the Caffe2 documentation (https://caffe2.ai/docs/applications-of-deep-learning.html)?
"As more complex bots are written using DNN, their ability to understand your statements, and more importantly,
the context, the bots will be able to hold longer, more meaningful conversations without you even realizing you are not chatting with a real person."
Thanks in advance...
Take a look at this: https://github.com/caffe2/caffe2/tree/master/caffe2/python/models/seq2seq
I have used this model for translation.
I'm pretty new with Custom Translator and I'm working on a fashion-related EN_KO project.
There are many cases where a single English term has two possible translations into Korean. For example, "fastening" is 잠금 when it relates to bags, backpacks and the like, but 여밈 when it relates to clothes, shoes and the like.
I'd like to train the machine to recognize these differences. Could it be useful to upload a phrase dictionary? Any ideas? Thanks!
The purpose of training a custom translation system is to teach it how to translate terms in context.
The best way to teach the system how to translate is training with parallel documents of full sentence prose: the same document in two languages. A translation memory extract in a TMX or XLIFF file is the best material, but many other document formats are suitable as well, as long as you have both languages. Have at least 10000 sentences in both languages, upload to http://customtranslator.ai, and build a custom system with it.
If you have documents in Korean that are representative of the terminology and style you want to achieve, without an English match, you can automatically translate those to English, and add to the training material as parallel documents. Be sure to not use the automatically translated documents in the other direction.
A phrase dictionary is of limited help, because it is unaware of context. It is useful only in bootstrapping your custom system or for very rare terms where you cannot find or create a sentence.
Greetings to everyone,
I want to design a system that can generate stories or poetry after training on a large text dataset, without needing a text description, opening line, or summary as input at inference time.
So far I have done this using RNNs, but as you know they have a lot of flaws. My question is: what are the best current methods for this task?
I looked into attention mechanisms, but they seem to be designed for translation tasks.
I know about GPT-2, BERT, the Transformer, etc., but all of them need a text description as input before generation, and this is not what I'm looking for. I want a system that can generate stories from scratch after training.
Thanks a lot!
edit
so the comment was: I want to generate text from scratch, not starting from a given sentence at inference time. I hope it makes sense.
Yes, you can do that. It is just simple code manipulation on top of the ready-made models, be it BERT, GPT-2 or an LSTM-based RNN.
How? You have to provide random input to the model. Such random input can be a randomly chosen word or phrase, or just a vector of zeros.
Hope it helps.
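To make the idea concrete, here is a minimal sketch using a toy bigram model as a stand-in for a trained neural language model. The corpus, the `<s>` / `</s>` marker tokens, and the function names are all made up for illustration. The point is only that generation starts from a fixed start marker and a random draw, with no prompt supplied by the user; with a real LSTM or transformer the same loop would sample from the network's predicted distribution instead of from a list of observed followers.

```python
import random
from collections import defaultdict

START, END = "<s>", "</s>"  # hypothetical sentence-boundary markers

def train_bigram(sentences):
    """Collect, for every token, the list of tokens that followed it."""
    followers = defaultdict(list)
    for sentence in sentences:
        tokens = [START] + sentence.split() + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            followers[prev].append(nxt)
    return followers

def generate(followers, rng, max_len=20):
    """Sample a sentence with no user prompt: the only input is START."""
    token, out = START, []
    while len(out) < max_len:
        token = rng.choice(followers[token])  # random draw drives the output
        if token == END:
            break
        out.append(token)
    return " ".join(out)

# Toy training corpus (an assumption, stands in for the large dataset).
corpus = [
    "the moon rises over the hill",
    "the river sings over the stones",
    "a bird sings to the moon",
]
model = train_bigram(corpus)
rng = random.Random(0)  # fixed seed only to make runs repeatable
for _ in range(3):
    print(generate(model, rng))
```

Changing the seed (or not fixing it at all) gives different texts from the same trained model, which is the "from scratch" behaviour asked about.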
You have mixed up several things here.
You can achieve what you want either using LSTM based or transformer based architecture.
When you said you did it with RNN, you probably mean that you have tried LSTM based sequence to sequence model.
Now, you mention attention in your question. You can use attention to improve your RNN, but it is not required. If you use a transformer architecture, however, attention is built into the transformer blocks.
GPT-2 is nothing but a transformer based model. Its building block is a transformer architecture.
BERT is also another transformer based architecture.
So to answer your question, you should and can try using LSTM based or transformer based architecture to achieve what you want. Sometimes such architecture is called GPT-2, sometimes BERT depending on how it is realized.
I encourage you to read this classic from Karpathy, if you understand it then you have cleared most of your questions:
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
I'm looking for some topic modeling tool which can be applicable to a large data set.
My current training data set is 30 GB. I tried MALLET topic modeling, but I always got an OutOfMemoryError.
If you have any tips, please let me know.
There are many options available to you, and this response is agnostic as to how they compare.
I think that the important thing with such a large dataset is the method of approximate posterior inference used, and not necessarily the software implementation. According to this paper, online Variational Bayes inference is much more efficient, in terms of time and space, than Gibbs sampling. Though I've never used it, the gensim package looks good. It's in python, and there are in-depth tutorials at the project's webpage.
For code that comes straight from the source, see the webpage of David Blei, one of the authors of the LDA model, here. He links to more than a few implementations, in a variety of languages (R, Java, C++).
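Whatever implementation is chosen, online inference only pays off if the corpus is streamed from disk rather than loaded at once. Below is a minimal sketch of that streaming step, assuming a hypothetical file layout of one document per line; `build_vocab` and `StreamedCorpus` are illustrative names, not part of any library. Each document is yielded as a list of `(token_id, count)` pairs, which is the bag-of-words form that gensim's LDA implementation accepts as a corpus, so an iterable like this one could be passed to it.

```python
import os
import tempfile
from collections import Counter

def build_vocab(path, min_count=1):
    """First pass over the file: keep tokens seen at least min_count times."""
    freq = Counter()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            freq.update(line.lower().split())
    kept = sorted(tok for tok, n in freq.items() if n >= min_count)
    return {tok: i for i, tok in enumerate(kept)}

class StreamedCorpus:
    """Yield one bag-of-words per line so the corpus never sits in memory."""

    def __init__(self, path, vocab):
        self.path = path
        self.vocab = vocab  # mapping: token -> integer id

    def __iter__(self):
        with open(self.path, encoding="utf-8") as handle:
            for line in handle:
                counts = Counter(
                    self.vocab[tok]
                    for tok in line.lower().split()
                    if tok in self.vocab
                )
                yield sorted(counts.items())  # list of (token_id, count)

# Tiny demo on a temporary file; a real run would point at the large file.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("apple banana apple\nbanana cherry\n")
demo_vocab = build_vocab(tmp.name)
demo_docs = list(StreamedCorpus(tmp.name, demo_vocab))
os.remove(tmp.name)
print(demo_docs)
```

Because `__iter__` reopens the file each time, the corpus can be iterated repeatedly (for several training passes) while memory use stays at roughly one document plus the vocabulary.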
I suggest using a "big data" tool such as graphlab, which supports topic modeling: http://docs.graphlab.org/topic_modeling.html
The GraphLab Create topic model toolkit (with Python API bindings) should be able to handle a dataset that large.
I wanted some input on an interesting problem I've been assigned. The task is to analyze hundreds, and eventually thousands, of privacy policies and identify core characteristics of them. For example, do they take the user's location?, do they share/sell with third parties?, etc.
I've talked to a few people, read a lot about privacy policies, and thought about this myself. Here is my current plan of attack:
First, read a lot of privacy policies and find the major "cues" or indicators that a certain characteristic is met. For example, if hundreds of privacy policies have the same line, "We will take your location.", that line could be a cue with 100% confidence that the privacy policy includes taking the user's location. Other cues would give much smaller degrees of confidence about a certain characteristic. For example, the presence of the word "location" might increase the likelihood that the user's location is stored by 25%.
The idea would be to keep developing these cues, and their appropriate confidence intervals to the point where I could categorize all privacy policies with a high degree of confidence. An analogy here could be made to email-spam catching systems that use Bayesian filters to identify which mail is likely commercial and unsolicited.
I wanted to ask whether you think this is a good approach to the problem. How exactly would you approach a problem like this? Furthermore, are there any specific tools or frameworks you'd recommend using? Any input is welcome. This is my first project that touches on artificial intelligence, specifically machine learning and NLP.
This is text classification. Given that you have multiple output categories per document, it's actually multilabel classification. The standard approach is to manually label a set of documents with the classes/labels that you want to predict, then train a classifier on features of the documents; typically word or n-gram occurrences or counts, possibly weighted by tf-idf.
The popular learning algorithms for document classification include naive Bayes and linear SVMs, though other classifier learners may work too. Any classifier can be extended to a multilabel one by the one-vs.-rest (OvR) construction.
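As a rough illustration of the one-vs.-rest construction, here is a self-contained sketch that trains one binary naive Bayes classifier per label on word counts. The four training "policies", the label names, and the class and function names are invented for the example; a real project would more likely use an existing library and far more labelled documents.

```python
import math
from collections import Counter

class BinaryNaiveBayes:
    """Multinomial naive Bayes for one yes/no label, with add-one smoothing."""

    def fit(self, docs, labels):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = Counter(labels)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        return self

    def _log_score(self, tokens, label):
        total_docs = sum(self.doc_counts.values())
        # Smoothed prior, so a class with no training documents is not log(0).
        score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        denom = sum(self.word_counts[label].values()) + len(self.vocab)
        for tok in tokens:
            if tok in self.vocab:  # words never seen in training are ignored
                score += math.log((self.word_counts[label][tok] + 1) / denom)
        return score

    def predict(self, tokens):
        return self._log_score(tokens, True) > self._log_score(tokens, False)

def fit_one_vs_rest(docs, label_sets, all_labels):
    """Train one independent binary classifier per label (one-vs.-rest)."""
    return {
        label: BinaryNaiveBayes().fit(docs, [label in s for s in label_sets])
        for label in all_labels
    }

def predict_labels(models, tokens):
    """Return every label whose binary classifier says yes."""
    return {label for label, model in models.items() if model.predict(tokens)}

# Hypothetical hand-labelled training set.
docs = [
    "we collect your location data".split(),
    "we share data with third parties".split(),
    "we collect location and share it with third parties".split(),
    "we never sell or store anything".split(),
]
label_sets = [{"location"}, {"sharing"}, {"location", "sharing"}, set()]
models = fit_one_vs_rest(docs, label_sets, ["location", "sharing"])
print(predict_labels(models, "share with third parties".split()))
```

Each label gets its own yes/no decision, so a document can receive zero, one, or several labels, which is what distinguishes multilabel from ordinary multiclass classification.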
A very interesting problem indeed!
At a higher level, what you want is summarization: a document has to be reduced to a few key phrases. This is far from being a solved problem. A simple approach would be to search for keywords as opposed to key phrases. You can try something like LDA for topic modelling to find what each document is about. You can then search for topics which are present in all documents. I suspect what will come up is stuff to do with licenses, location, copyright, etc. MALLET has an easy-to-use implementation of LDA.
I would approach this as a machine learning problem where you are trying to classify things in multiple ways, i.e. wants location, wants SSN, etc.
You'll need to enumerate the characteristics you want to use (location, SSN), and then for each document say whether that document uses that info or not. Choose your features, train your classifier, and then classify and test.
I think simple features like words and n-grams would probably get you pretty far, and a dictionary of words related to stuff like SSN or location would finish it nicely.
Use the machine learning algorithm of your choice- Naive Bayes is very easy to implement and use and would work ok as a first stab at the problem.
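The feature step described above could be sketched as follows. This is only an illustration under assumptions: the tokenizer is deliberately crude, and `TOPIC_TERMS` is a hypothetical hand-built dictionary, not a vetted word list. The function counts word n-grams up to length three and then adds one extra feature per characteristic recording how many dictionary terms occurred.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def ngram_features(text, max_n=3):
    """Count every word n-gram of length 1..max_n in the text."""
    tokens = tokenize(text)
    feats = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    return feats

# Hypothetical dictionary of terms per characteristic.
TOPIC_TERMS = {
    "location": {"location", "gps", "geolocation"},
    "ssn": {"ssn", "social security number"},
}

def dictionary_features(feats, topic_terms=TOPIC_TERMS):
    """One extra count per topic: how many dictionary terms were seen."""
    return {
        "dict:" + topic: sum(feats[term] for term in terms)
        for topic, terms in topic_terms.items()
    }

policy = "We store your Social Security Number and GPS location."
feats = ngram_features(policy)
print(dictionary_features(feats))
```

The n-gram counts and the dictionary counts can then be merged into one feature vector per document and handed to whichever classifier is chosen.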
Can anybody suggest a way to process and analyze the comments users post on an article on my website?
Specifically, I want to process the comments as follows.
Example: an article on computerization may get the following comments:
I love computerization as it makes the work easier.
Computerization is spreading unemployment as 1 computer can work better than 4 people.
How I process this information:
I take the comments and try to recognize some predefined (and extensible) keywords in them.
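The keyword step described above might look roughly like this. The categories and word lists in `KEYWORDS` are invented for illustration; the dictionary is meant to be extended as new terms turn up.

```python
import re

# Hypothetical, extensible keyword lists grouped by category.
KEYWORDS = {
    "positive": ["love", "easier", "better"],
    "negative": ["unemployment", "hate"],
}

def find_keywords(comment, keywords=KEYWORDS):
    """Return the known keywords present in a comment, grouped by category."""
    words = set(re.findall(r"[a-z]+", comment.lower()))
    hits = {}
    for category, terms in keywords.items():
        found = sorted(words.intersection(terms))
        if found:
            hits[category] = found
    return hits

comments = [
    "I love computerization as it makes the work easier.",
    "Computerization is spreading unemployment as 1 computer can work "
    "better than 4 people.",
]
for comment in comments:
    print(find_keywords(comment))
```

Note that the second comment matches "better" as well as "unemployment", even though its overall stance is negative. Plain keyword matching cannot tell which sense is intended, which is where a trained classifier becomes useful.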
Assuming that you are trying to extract some useful information from the comments, you could apply some machine learning to the comments to classify or categorize the data contained within, the sentiments etc.
There are a number of different types of learning you can do on the text; however, I personally recommend using support vector machines or a naive Bayes classifier to categorize and analyze the comments. You could also possibly use clustering, but there needs to be an element of natural language processing in the solution you choose. There are a number of different libraries that you can use to implement either approach, e.g. svmlight, javaml, etc. I have personally used javaml and it is a good library.