I have tried several times to train customized models for the English-German language pair in the Technology category, but every attempt ended with the status Trainingfailed.
FYI: Training Status Screenshot
According to "Training Details", sentence alignment seems to work well, and no explicit error message is shown.
FYI: Training Details Screenshot
Do you have any suggestions for solving this problem?
Thanks very much.
We were experiencing a transient issue in our training environment that has now been resolved. Please retry your training and it should succeed.
I am trying to train and test the VQA model on https://github.com/akirafukui/vqa-mcb with my own dataset.
I took a deep learning unit at university, but I still don't know how to use the model in my own code. I also don't understand the Prerequisites section of the README.
Could you please give me some directions or useful resources on the web? Thank you very much!
Since yesterday, I have attempted several custom model trainings, and all but one ended with the status “Trainingfailed”. The portal does not provide any details on why the trainings failed…
This is not the first time we have experienced this problem. A model training commonly fails once or twice and then succeeds on the third or fourth attempt, always with the same training data.
Can you assist?
Thanks.
We were experiencing an issue that has now been resolved. Training is running properly now.
Here is the project that I have created in your new Custom Translator platform:
https://portal.customtranslator.azure.ai/Scripts/NgApp/text-translator/projects/7f1b6f9a-d4f4-4ea1-8492-69075339422f
I have tried to run a training session several times, but it does not seem to complete.
I have made sure to add my subscription. Could you please check whether there is something that I am doing wrong? Thank you very much.
It seems you have three training jobs that have completed: two succeeded and one failed. Please check, and reach out to Custom Translator support at custommt#microsoft.com with any further questions about Custom Translator.
I have a couple of files in 2 different projects that have been in "Training" status for more than 24 hours now in the new Custom Translator (not Translator Hub). Each is only about 26,000 sentences, which doesn't justify such a long wait. I have no other status screens or resources to check, so any ideas would be welcome.
We were experiencing some issues related to training last week that were causing model training times to run longer than they should. We have since fixed the problem and I hope that your job has completed at this time.
I was training a bvlc_googlenet model using Caffe on a set of images with 5 output values. The test results I received are unclear to me, as the test accuracy was above 1 in many cases. How should I interpret those results? Is it an error?
Here are the test logs of the training, obtained using the command
/home/ubuntu/caffe/tools/extra/parse_log.sh train.log
https://pastebin.com/8KN6g7Rx
Please see this Caffe pull request, which fixes this bug.
As a workaround (if you do not want to merge this PR) you can use Accuracy only during testing and disable it for training.
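For illustration, the workaround could look roughly like the fragment below in the network definition. This is only a sketch: the layer name "accuracy" and the bottom blob name "loss3/classifier" are assumptions based on a typical GoogLeNet-style train/val prototxt and will probably differ in your own file. The part that matters is the include rule, which restricts the Accuracy layer to the TEST phase so that it is not instantiated during training.

```prototxt
# Hypothetical Accuracy layer; adjust name/bottom/top to match your network.
layer {
  name: "accuracy"              # assumed layer name
  type: "Accuracy"
  bottom: "loss3/classifier"    # assumed name of the final classifier output
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST                 # layer exists only in the test net
  }
}
```

If your file defines several Accuracy layers (for example top-1 and top-5, or one per auxiliary classifier), each of them would need the same include rule.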