DIGITS 4.0 / 0.14.0-rc.3 / Ubuntu (AWS)
I am training a 5-class GoogLeNet model with about 800 training samples in each class. I was trying to use the BVLC GoogLeNet ImageNet model as the pre-trained model. These are the steps I took:
1. Downloaded the ImageNet-pretrained GoogLeNet model from http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel and placed it in /home/ubuntu/models.
2.
a. Pasted the "train_val.prototxt" from https://github.com/BVLC/caffe/blob/master/models/bvlc_reference_caffenet/train_val.prototxt into the custom network tab, and
b. commented out the "source" and "backend" lines with '#' (since it was complaining about them).
3. In the pre-trained models text box, pasted the path to the '.caffemodel', in my case: "/home/ubuntu/models/bvlc_googlenet.caffemodel".
I get this error:
ERROR: Cannot copy param 0 weights from layer 'loss1/classifier'; shape mismatch. Source param shape is 1 1 1000 1024 (1024000); target param shape is 6 1024 (6144). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
I have pasted various train_val.prototxt files from GitHub issues etc., but no luck unfortunately.
I am not sure why this has become so complicated; in older versions of DIGITS, we could just enter the path to the folder and it worked great for transfer learning.
Could someone help?
Rename the layer from "loss1/classifier" to "loss1/classifier_retrain".
When fine-tuning a model, here's what Caffe does:
# pseudo-code
for layer in new_model:
    if layer.name in old_model:
        new_model.layer.weights = old_model.layer.weights
You're getting an error because the weights for "loss1/classifier" were for a 1000-class classification problem (1000x1024), and you're trying to copy them into a layer for a 6-class classification problem (6x1024). When you rename the layer, Caffe doesn't try to copy the weights for that layer and you get randomly initialized weights - which is what you want.
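To make that concrete, here is a toy sketch of the copy-by-name logic in plain numpy (illustration only, not actual Caffe code; the layer names and shapes mirror your error message):

import numpy as np

# Toy illustration of Caffe's copy-by-name behaviour (not real Caffe code).
pretrained = {'loss1/classifier': np.random.randn(1000, 1024)}    # ImageNet weights
new_net = {'loss1/classifier_retrain': np.random.randn(6, 1024)}  # renamed 6-class layer

for name, weights in new_net.items():
    if name not in pretrained:
        print(name, '-> not found in the saved net, keeps its random init')
    elif pretrained[name].shape != weights.shape:
        print(name, '-> found, but shapes differ: this is the shape-mismatch error')
    else:
        new_net[name] = pretrained[name]  # weights copied over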
Also, I suggest you use this network description which is already set up as an all-in-one network description for GoogLeNet. It will save you some trouble.
https://github.com/NVIDIA/DIGITS/blob/digits-4.0/digits/standard-networks/caffe/googlenet.prototxt
AssertionError: train: No labels found in /content/dataset/test/labels.cache, can not start training
I'm using YOLOv5. An error occurs during configuration and training of a dataset. I understand that the command automatically generates the cache files under the train folder, and I've confirmed that they are actually created, yet these errors still appear. With the same code, the error only occurs when the dataset is different. What's the problem? I'd appreciate your help.
A customer has created a custom Inventor Material Library: all his parts are defined to use these custom materials in order to obtain a very realistic representation of the model.
When I try to convert his assembly files (exported with Pack & Go) with Autodesk FORGE Model Derivative, the output models (no difference between the SVF and SVF2 formats) contain some white surfaces on the parts defined with custom materials.
Interestingly, in the browser's developer console I found an HTTP 400 error related to Galvinized_2_svf_tex_mod.png.
What can I do in order to obtain a better SVF model?
I've found the cause: see "Appearances with enabled Self-Illumination are translated to white when creating a shared view from Inventor".
The customer's Material Library has "Self Illumination" active!
I just had a similar issue. I removed the "Self Illumination" and the color came through. I then added "Self Illumination" back to the color, but added the assembly file to my zip with copy and paste in lieu of Pack and Go. And the color still works in the viewer.
Don't know if that always works, but if anyone else needs SI, it might be worth a try.
P.S. Good to know that shared views can be a quick test of the Viewer. Hadn't thought of that before.
I'm new to Caffe and, after successfully running an example, I'm trying to use my own data. However, when I try to either write my data into the LMDB data format or directly run the solver, in both cases I get the error:
E0201 14:26:00.450629 13235 io.cpp:80] Could not open or find file ~/Documents/ChessgameCNN/input/train/731_1.bmp 731
The path is right, but it's weird that the label 731 is part of this error message. That implies that it's reading it as part of the path instead of as a label. The text file looks like this:
~/Documents/ChessgameCNN/input/train/731_1.bmp 731
Is it because the labels are too high? Or maybe because the labels don't start at 0? I've searched for this error and all I found were examples with relatively few labels, about 1-5, but I have about 4096 classes, of which I don't always have examples in the training data. Maybe this is a problem too (certainly for learning, at least, but I didn't expect it to give me an actual error message). Usually, the label does not seem to be part of this error message.
For the creation of the lmdb file, I use the create_imagenet.sh from the caffe examples. For solving, I use:
~/caffe/build/tools/caffe train --solver ~/Documents/ChessgameCNN/caffe_models/caffe_model_1/solver_1.prototxt 2>&1 | tee ~/Documents/ChessgameCNN/caffe_models/caffe_model_1/model_1_train.log
I tried different image data types, too: PNG, JPEG and BMP. So this isn't the culprit, either.
If it is really because of my choice of labels, what would be a viable workaround for this problem?
Thanks a lot for your help!
I had the same issue. Check that the lines in your text file don't have spaces at the end.
I was facing a similar problem with convert_imageset. I solved it just by removing the trailing spaces in the text file that contains the labels.
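If it helps, here is a minimal Python sketch for normalizing the list file before building the LMDB (the train.txt filename is just an example; use whatever file you pass to create_imagenet.sh / convert_imageset):

# Strip trailing whitespace so the label is not read as part of the file path.
with open('train.txt') as f:
    lines = [line.rstrip() for line in f if line.strip()]

with open('train.txt', 'w') as f:
    f.write('\n'.join(lines) + '\n')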
I am training a Faster R-CNN (VGG-16 architecture) on the INRIA Person dataset. It was trained for 180,000 training steps. But when I evaluate the network, it gives varying results for the same image.
Following are the images
I am not sure why it gives different results for the same set of weights. The network is implemented in Caffe.
Any insight into the problem is much appreciated.
The following image shows the different network losses.
Recently, I also prepared my own dataset for training and got similar results to yours.
Here are my experiences, which I'll share with you:
1. Check the input format, including the images and your bounding-box CSV file or XML (usually kept in the Annotations folder): are all bounding boxes (x1, y1, x2, y2) correct?
2. Then check the roidb/imdb loading Python script (located at FasterRCNN/lib/datasets/pascal_roi.py; yours is maybe inria.py), and make sure _load_xxx_annotation() correctly loads all bounding boxes by printing bounding_box and filename (see the sketch after this list). Importantly, if your script was copied and modified from pascal_roi.py or any other prototype script, check whether it saves all ROI and image info into a cache file; if it does, you need to delete that cache file whenever you change any configuration files and re-try.
3. Finally, make sure all bounding boxes are generated correctly while the network is training (e.g. print the minibatch variable to show the filename and the corresponding x1, y1, x2, y2 in FasterRCNN/lib/roi_data_layer/layer.py). If the ROI generator works correctly, the bounding boxes will not differ much from the bounding boxes you selected manually.
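If it helps, here is a rough Python sketch of the first two checks (the annotation directory, XML field names, and cache filename below are assumptions based on the usual py-faster-rcnn layout; adjust them to your dataset):

import glob
import os
import xml.etree.ElementTree as ET

ANNOT_DIR = 'data/INRIAPerson/Annotations'          # assumed XML annotation folder
CACHE_FILE = 'data/cache/inria_train_gt_roidb.pkl'  # assumed roidb cache file

# Print every filename and box so you can eyeball (x1, y1, x2, y2).
for xml_path in sorted(glob.glob(os.path.join(ANNOT_DIR, '*.xml'))):
    tree = ET.parse(xml_path)
    for obj in tree.findall('object'):
        b = obj.find('bndbox')
        box = [int(float(b.find(k).text)) for k in ('xmin', 'ymin', 'xmax', 'ymax')]
        print(os.path.basename(xml_path), box)

# The roidb is cached; a stale cache silently overrides any config changes.
if os.path.exists(CACHE_FILE):
    os.remove(CACHE_FILE)
    print('removed stale roidb cache:', CACHE_FILE)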
Some similar issues may cause this problem as well.
I have a model with a simple animation designed in 3ds Max (my version is 2013). How can I export it to the .json format including its animation (for use with Three.js)? I've tried several times to export it with the tools from the Three.js package, but to no avail (the "morphTargets" array is still empty). How can I handle this problem? Is there any other way? Do I have to use Maya to make the animation for my model? Thanks for reading!
To get morph targets you need to export each morph target object individually as a .obj file, including the unmorphed mesh. Then you need to pass them all to the python convert script found in /utils/converters/obj:
python convert_obj_three.py -i unmorphedmesh.obj -m 'morphmesh1.obj morphmesh2.obj' -o compiledTargets.js
and then your morph targets will be populated.
If you want a boned / rigged mesh, I wrote a blog post detailing the entire rigged 3ds Max > three.js export.
It is currently impossible to have both a boned / rigged mesh and morph targets coming from 3ds Max. There is an example of both skinning and morphing in three.js, so it's possible in three.js itself, but that model was made in Blender.
This StackOverflow post suggests that the 'correct' pathway is Max > OBJ > Three.js, which would mean no animations, since OBJ is not an animation format. The Three.js Max exporter here also does not include animation (neither does the Maya one), AFAICT.
Sorry I don't have a better suggestion.