How do I train T5 in an autoregressive fashion? - deep-learning

I want to try training T5 in an autoregressive fashion; where do I set that up? Also, I'd like to ask which piece of code implements teacher forcing and which implements autoregressive decoding.
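To make the question concrete, here is roughly what I mean, sketched against the Hugging Face transformers T5 implementation (this is my assumption about where each behaviour lives, not a confirmed answer):

# Sketch only, assuming the Hugging Face transformers library.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tokenizer("translate English to German: Hello", return_tensors="pt")
labels = tokenizer("Hallo", return_tensors="pt").input_ids

# Training step: passing labels makes the decoder consume the right-shifted
# labels internally, i.e. teacher forcing.
loss = model(**inputs, labels=labels).loss

# Inference: generate() feeds each predicted token back in,
# i.e. autoregressive decoding.
pred_ids = model.generate(**inputs)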

Related

How to encode poker cards?

I am currently working on a poker AI and I am stuck on this question: what is the best way to encode poker cards for my AI? I am using deep reinforcement learning techniques and I just don't know how to answer this question.
The card information is stored as a string. For example, "3H" would be "three of hearts". I thought about ranking the cards and attaching values to them, so that a high-rated card like AH ("Ace of hearts") would get a high number like 52 or something like that. The problem with this approach is that it doesn't take the suits into account.
I have seen methods that simply assign a number to each card, so that at the end there are 52 numbers from 0-51 (https://www.codewars.com/kata/52ebe4608567ade7d700044a/javascript). The problem I see with that is that my neural net wouldn't recognise, or would at least have difficulty recognising, the connection between similar cards like Aces (because, as in the link above, one Ace is labelled 0, another 13, etc.).
Can someone please help me find an encoding that takes care of the suits, values, ranks, etc., so that my NN is able to pick up the connections between similar cards?
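For concreteness, this is the kind of decomposition I had in mind (just a sketch; the rank/suit orderings and "T" for ten are my own assumptions):

# Sketch: encode a card string like "3H" as separate rank and suit one-hot
# vectors, so that similar ranks land in the same slot regardless of suit.
import numpy as np

RANKS = "23456789TJQKA"   # assumed rank order, "T" for ten
SUITS = "CDHS"            # assumed suit order

def encode_card(card):
    rank_vec = np.zeros(len(RANKS))
    suit_vec = np.zeros(len(SUITS))
    rank_vec[RANKS.index(card[0])] = 1
    suit_vec[SUITS.index(card[1])] = 1
    return np.concatenate([rank_vec, suit_vec])   # 13 + 4 = 17 values

print(encode_card("3H"))   # three of hearts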
Thanks in advance.

Model suggestion: Keyword spotting

I want to predict the occurrences of the word "repeat" in a speech, as well as each occurrence's approximate duration. For this task, I'm planning to build a deep learning model. I have around 50 positive and 50 negative utterances (I couldn't collect more).
Initially, I searched for pretrained keyword-spotting models, but I couldn't find a good one.
Then I tried speech recognition models (Deep Speech), but they couldn't reliably pick out the "repeat" occurrences, as my data has an Indian accent. I also felt that using full ASR models for this task would be overkill.
Now I've split the entire audio into 1-second chunks with 50% overlap and tried binary audio classification on each chunk, i.e. whether the chunk contains the word "repeat" or not. For the classification model, I computed MFCC features and built a sequence model on top of them. Nothing seems to work for me.
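For context, my chunking and feature extraction roughly follow this pattern (a sketch assuming librosa; the file name and MFCC settings are placeholders):

# Sketch of the 1-second / 50% overlap chunking plus MFCC features.
import librosa

y, sr = librosa.load("speech.wav", sr=16000)   # placeholder path
win = sr          # 1 second of samples
hop = win // 2    # 50% overlap

chunks = [y[s:s + win] for s in range(0, len(y) - win + 1, hop)]
features = [librosa.feature.mfcc(y=c, sr=sr, n_mfcc=13).T for c in chunks]
# each entry is a (frames, 13) MFCC matrix that goes into the sequence model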
If anyone has already worked on this kind of task, please point me to a suitable method/resources for building a DL model for it. Thanks in advance!

U-Net segmentation without having masks

I am new to deep learning and semantic segmentation.
I have a dataset of medical images (CT) in DICOM format, in which I need to segment tumours and the organs involved. I have labelled organs contoured by our physician, which we call the RT structure, also stored in DICOM format.
As far as I know, people usually use a "mask". Does that mean I need to convert all the contoured structures in the RT structure to masks, or can I use the information from the RT structure (.dcm) directly as my input?
Thanks for your help.
There is a special library called pydicom that you need to install before you can actually decode and later visualise the CT images.
Now, since you want to apply semantic segmentation to segment the tumours, the approach is to create a neural network that accepts [image, mask] pairs as input, where all locations in the mask are 0 except the zones where the tumour is, which are marked with 1; in practice, the mask is your ground truth.
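One possible way to rasterise a contour into such a mask (only a sketch: it assumes the RT structure's contour points have already been converted from patient coordinates to pixel row/column indices using the image geometry):

# Sketch: turn one closed contour, given as pixel coordinates, into a binary mask.
import numpy as np
from skimage.draw import polygon

def contour_to_mask(rows, cols, image_shape):
    mask = np.zeros(image_shape, dtype=np.uint8)
    rr, cc = polygon(rows, cols, shape=image_shape)
    mask[rr, cc] = 1          # tumour / organ pixels are 1, everything else 0
    return mask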
Of course, for this you will have to implement your own CustomDataGenerator(), which must yield a batch of [image, mask] pairs at every step, as stated above.
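A minimal sketch of such a generator (Keras Sequence API assumed; load_image and load_mask are hypothetical helpers you would write around pydicom and the mask conversion):

import numpy as np
from tensorflow.keras.utils import Sequence

class CustomDataGenerator(Sequence):
    def __init__(self, image_paths, mask_paths, batch_size=8):
        self.image_paths = image_paths
        self.mask_paths = mask_paths
        self.batch_size = batch_size

    def __len__(self):
        return int(np.ceil(len(self.image_paths) / self.batch_size))

    def __getitem__(self, idx):
        sl = slice(idx * self.batch_size, (idx + 1) * self.batch_size)
        images = np.stack([load_image(p) for p in self.image_paths[sl]])  # hypothetical helper
        masks = np.stack([load_mask(p) for p in self.mask_paths[sl]])     # hypothetical helper
        return images, masks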

Using CNN for Bidding in Belote card game

I have an idea to use a CNN to learn a Belote bidder.
We have 32 cards, of which the bidder holds only 5. I put them in a 4x8 "image": arr[r][s] = 1 if he has the card of rank r and suit s, and 0 if he doesn't.
So there are only 5 ones in the image.
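For clarity, the encoding looks roughly like this (a numpy sketch; the suit/rank orderings and card string format are my own choices):

# Sketch: a 5-card Belote hand as a 4x8 binary "image" (rows = suits, columns = ranks).
import numpy as np

SUITS = "CDHS"                                      # 4 rows, assumed order
RANKS = ["7", "8", "9", "T", "J", "Q", "K", "A"]    # 8 columns, assumed order

def hand_to_image(hand):                 # e.g. ["7C", "TH", "9S", "AH", "KD"]
    img = np.zeros((4, 8), dtype=np.float32)
    for card in hand:
        s = SUITS.index(card[1])
        r = RANKS.index(card[0])
        img[s, r] = 1.0                  # exactly 5 ones for a 5-card hand
    return img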
Then I simulate the possible bids N times with a Monte Carlo algorithm and build a vector of probabilities for each bid; they sum to 1.
After this I try to train the CNN, but no matter what I do, it doesn't learn well and it doesn't generalize. Maybe the input data is not suitable for a CNN and I need something else, but the most important thing is that I need it to generalize well, because computing the output is very expensive and I want to do it only once, save the CNN, and then reuse it.

What is the principle of readout and teacher forcing?

These days I am studying RNNs and teacher forcing, but there is one point I can't figure out. What is the principle of readout and teacher forcing? How do we feed the output (or ground truth) of the RNN from the previous time step back into the current time step: by using the output as features together with this step's input, or by using the output as this step's cell state? I have read some papers but I am still confused. o(╯□╰)o Hoping someone can answer this for me.
Teacher forcing is the act of using the ground truth as the input at each time step, rather than the output of the network. The following is some pseudocode describing the situation:
x   = inputs            # ground-truth tokens for steps [0 : n]
y   = expected_outputs  # targets for steps [1 : n+1] (x shifted one step ahead)
out = network_outputs   # model predictions for steps [1 : n+1]

def teacher_forcing():
    for step in range(n):
        # input is the ground truth x[step], not the model's previous output out[step]
        out[step + 1] = cell(x[step])
As you can see, rather than feeding the network's output from the previous time step back in as input, it instead provides the ground truth.
It was originally motivated as a way to avoid BPTT (backpropagation through time) in models that don't contain hidden-to-hidden connections, i.e. networks whose only recurrence goes from the output back into the hidden state.
It can also be used as part of a training regime (often called scheduled sampling), the idea being that you slowly decrease the amount of teacher forcing from the beginning of training to the end. This has been shown to have a regularizing effect on the network. The paper linked here has further reading, or the Deep Learning Book is good too.
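As a rough illustration of that decaying-teacher-forcing regime (decoder_cell and the token handling below are placeholders, not taken from any particular framework):

import random

def decode_sequence(decoder_cell, targets, teacher_forcing_ratio):
    # targets: ground-truth tokens, with targets[0] a start-of-sequence token
    outputs = []
    prev_token = targets[0]
    for t in range(1, len(targets)):
        pred = decoder_cell(prev_token)          # one decoding step
        outputs.append(pred)
        if random.random() < teacher_forcing_ratio:
            prev_token = targets[t]              # teacher forcing: feed ground truth
        else:
            prev_token = pred                    # free running: feed own prediction
    return outputs

# Over training, anneal teacher_forcing_ratio from 1.0 toward 0.0 to get the
# "slowly decrease the amount of teacher forcing" schedule described above.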