Shuffle in Caffe with multiple lmdbs - deep-learning

I am using Caffe with lmdb databases to feed data to the network. However, I have two different lmdbs, one for the input and one for the ground truth, since my ground-truth data are images as well. Is it possible to use shuffling anyway? If so, can I just set shuffle: true as a parameter for both lmdbs?
layer {
  name: "data"
  type: "Data"
  top: "data"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_value: X
  }
  data_param {
    source: "..."
    batch_size: X
    backend: LMDB
  }
}

If you use a layer of type "Data", you can't shuffle: there is no shuffle parameter in data_param.
A layer of type "ImageData" does have a shuffle parameter, but it can't read from an lmdb; its source must be a text file listing image paths and labels. If you look inside image_data_layer.cpp, you'll see that when shuffle is true, the image list is reshuffled at each epoch using the Fisher–Yates algorithm. If you used two separate ImageData layers, ShuffleImages() would be called independently for each of them, and the two shuffles are very unlikely to produce the same sequence, so your inputs and ground truths would fall out of alignment. Hence you can't enable shuffle in either of the two ImageData layers.
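A common workaround (not part of the answer above, just a sketch) is to shuffle the two paired list files offline with the same permutation before each epoch or training run, so line i of the data list still matches line i of the ground-truth list; the file names here are hypothetical:

```python
import random

def shuffle_paired_lists(data_list, label_list, seed=0):
    """Shuffle two parallel list files with the same permutation,
    so line i of one still corresponds to line i of the other."""
    with open(data_list) as f:
        data_lines = f.read().splitlines()
    with open(label_list) as f:
        label_lines = f.read().splitlines()
    assert len(data_lines) == len(label_lines), "lists must be parallel"

    order = list(range(len(data_lines)))
    random.Random(seed).shuffle(order)  # one permutation, applied to both files

    with open(data_list, "w") as f:
        f.write("\n".join(data_lines[i] for i in order) + "\n")
    with open(label_list, "w") as f:
        f.write("\n".join(label_lines[i] for i in order) + "\n")
```

With shuffling done offline, both layers can then read their (already shuffled) sources in order, with shuffle left off.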

Related

Forge Viewer_New model browser_Linked rvt files

My problem is that in the model browser I can't see the elements organized by model; they all appear together in no particular order. I have been able to load linked files via a job payload with rootFilename. I have found a lot of information about this, but if someone has done it, or has an idea of how to do it, I would greatly appreciate the help or a starting point. Thank you all in advance. All the best.
If the model is a composite Revit translation, where the host and linked RVT files are translated together from a zip package, the model structure is merged into a single tree, so the viewer won't organize the tree structure by model.
However, we can tell which objects come from a linked RVT. See the concept here:
https://stackoverflow.com/a/64672951
and here is a code example; I use this function to get their dbIds.
async getRevitLinkedElementIds(rvtLinkExternalIds, model) {
    const modelKey = model.getModelKey();
    const externalIdMap = this.modelExternalIdMaps[modelKey];
    const entities = Object.entries(externalIdMap);

    // Keep entries whose externalId contains a link-instance prefix like `${instanceId}/`
    const linkedElementIds = rvtLinkExternalIds
        .map(instanceId => entities
            .filter(entity => entity[0].includes(`${instanceId}/`))
            .map(entity => entity[1]))
        .flat();

    return linkedElementIds;
}
I think you can make use of the linkedElementIds, and then call model.getInstanceTree().getNodeParentId( dbId ) repeatedly until you reach the root node id; that way you can get the names of the non-leaf nodes, e.g., Family Type, Family, and Category, and rebuild your own tree nodes using jstree.js. (Don't use the non-leaf nodes' dbIds, since they are shared by the host and linked contents.)
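The repeated getNodeParentId calls are just a climb up a child-to-parent map. As a language-neutral sketch (in Python, with a toy parent table standing in for the instance tree; the names below are taken from the node-data example, not from a real model):

```python
def path_to_root(parent_of, node):
    """Walk a child -> parent mapping from `node` up to the root,
    returning the chain of nodes visited (leaf first, root last)."""
    chain = [node]
    while node in parent_of:
        node = parent_of[node]
        chain.append(node)
    return chain

# Toy stand-in for the instance tree:
# element -> family type -> family -> category -> root
parents = {
    "System Panel [976942]": "Glazed",
    "Glazed": "System Panel",
    "System Panel": "Curtain Panels",
    "Curtain Panels": "root",
}
```

Collecting the chain for each linked element gives you the ancestor names needed to rebuild the per-model tree.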
Afterward, you can build jstree.js node data like this for each model (host and links) to expand the tree structure in a custom panel, as in the code example.
[
{ id1, externalId: -1, type: 'revit-category', text: 'Curtain Panels' },
{ id2, externalId: -1, type: 'revit-family', text: 'System Panel' },
{ id3, externalId: -1, type: 'revit-family-type', text: 'Glazed' },
{ id4, externalId: '03d1b04e-ba90-4a0e-8fe2-eca95236e26a/ab343b7e-3705-4b87-bacc-33c06a6cee1d-000ee82e', type: 'revit-elements', text: 'System Panel [976942]' }
]

Pass multiple files to Autodesk Forge Design Automation API

If I use just one file it works perfectly, but with more than one it fails.
This is my request
{
  "Arguments": {
    "InputArguments": [
      {
        "Resource": "https://s3url.com",
        "Name": "HostDwg1-050A-014"
      },
      {
        "Resource": "https://s3url.com",
        "Name": "HostDwg1-050A-015"
      }
    ],
    "OutputArguments": [
      {
        "Name": "Result1-050A-014",
        "HttpVerb": "PUT",
        "Resource": "https://s3url.com",
        "StorageProvider": "Generic"
      },
      {
        "Name": "Result1-050A-015",
        "HttpVerb": "PUT",
        "Resource": "https://s3url.com",
        "StorageProvider": "Generic"
      }
    ]
  },
  "ActivityId": "PlotToPDF",
  "Id": ""
}
This is the error I get:
The number of Arguments is bigger than the number of Parameters.
Parameter name: Count
How should the request be structured to convert more than one file without submitting a separate request per file? Thanks.
The PlotToPDF activity declares exactly one input parameter and exactly one output parameter. An activity is like a function in a programming language: you can only provide as many arguments as there are parameters. So...
If you want a workitem with more than one input/output argument, you should define a new custom activity that declares more than one input/output parameter.
If you just want to plot multiple files, simply submit multiple workitems, one per file.
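To illustrate the one-workitem-per-file approach, here is a hedged Python sketch that only builds one PlotToPDF payload per input file; the URLs are placeholders, the argument names HostDwg and Result are assumptions about the activity's parameter names, and actually POSTing the payloads to the Design Automation endpoint is omitted:

```python
def build_workitems(files):
    """Build one PlotToPDF workitem payload per (input_url, output_url) pair,
    since the stock activity accepts exactly one input and one output argument."""
    workitems = []
    for input_url, output_url in files:
        workitems.append({
            "Arguments": {
                "InputArguments": [
                    # Name is assumed to match the activity's single input parameter
                    {"Resource": input_url, "Name": "HostDwg"},
                ],
                "OutputArguments": [
                    {"Resource": output_url, "Name": "Result",
                     "HttpVerb": "PUT", "StorageProvider": "Generic"},
                ],
            },
            "ActivityId": "PlotToPDF",
            "Id": "",
        })
    return workitems
```

Each payload in the returned list would then be submitted as its own workitem request.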

Is it possible to check empty label during training using HDF5 layer?

I am using the HDF5 data layer. The layer reads the images and their labels (from list.txt) and feeds them to the network.
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  hdf5_data_param {
    source: "./list.txt"
    batch_size: 10
    shuffle: true
  }
}
My list.txt contains some images with an empty label (i.e., an all-black label image). I want to check whether an empty label is selected during training. Can this be done in Caffe?
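Caffe itself doesn't report this, but one way (a sketch, not from the question) is to inspect the label blob from Python, e.g., in a pycaffe training loop or inside a Python layer; the core check is simply whether a label image is all zeros. With numpy, and a hypothetical batch for illustration:

```python
import numpy as np

def empty_label_indices(label_batch):
    """Return indices of samples in an (N, C, H, W) label batch whose
    label image is entirely zero (an 'empty' / all-black label)."""
    flat = label_batch.reshape(label_batch.shape[0], -1)
    return np.flatnonzero(~flat.any(axis=1)).tolist()

# Hypothetical batch of 3 label images; sample 1 is all zeros (empty)
batch = np.zeros((3, 1, 4, 4), dtype=np.float32)
batch[0, 0, 1, 2] = 1.0
batch[2, 0, 0, 0] = 5.0
```

In a pycaffe loop this would be applied to something like net.blobs['label'].data after each solver step.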

Extracting scores from GoogleNet

I would like to extract the scores from the GoogleNet network in the Caffe model zoo, but I don't quite understand which layers hold the scores.
I get:
Check failed: feature_extraction_net->has_blob(blob_names[i]) Unknown feature blob name loss3/top-5 in the network ./train_val.prototxt
Any suggestion?
The layer loss3/top-5 is an Accuracy layer. I will assume that this is what you meant by scores.
By default, you will have something like this:
layer {
  name: "loss3/top-5"
  type: "Accuracy"
  bottom: "loss3/classifier"
  bottom: "label"
  top: "loss3/top-5"
  include {
    phase: TEST
  }
  accuracy_param {
    top_k: 5
  }
}
If you want to have this layer in training, you can remove or comment out (#) the include section.
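Note that loss3/top-5 only outputs an accuracy scalar. If by "scores" you mean per-class scores, the blob to extract is loss3/classifier (the raw logits); turning logits into probability scores is just a softmax, sketched here in numpy (the input array in the test is a hypothetical logit vector, not real network output):

```python
import numpy as np

def softmax(logits):
    """Convert raw classifier outputs (e.g. the loss3/classifier blob)
    into probability scores, with max-subtraction for numerical stability."""
    z = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)
```

In pycaffe, the logits for a batch would come from something like net.blobs['loss3/classifier'].data after a forward pass.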

Result of multiple test stages is incorrect

I tried to make use of the test_state functionality of the Caffe solver while training. To implement this, I added the following to solver.prototxt:
test_state: { stage: 'test-on-testSet0' }
test_iter: 726
test_state: { stage: 'test-on-testSet1' }
test_iter: 363
Then I modified the train_val.prototxt like this:
layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  include {
    phase: TEST
    stage: "test-on-testSet0"
  }
  transform_param {
    mirror: false
    scale: 0.0039215684
  }
  image_data_param {
    source: "./set0.lst"
    batch_size: 1
  }
}
layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  include {
    phase: TEST
    stage: "test-on-testSet1"
  }
  transform_param {
    mirror: false
    scale: 0.0039215684
  }
  image_data_param {
    source: "./set0.lst"
    batch_size: 2
  }
}
Note that the two test stages are effectively identical: each runs over the complete set of images in ./set0.lst.
Still, when training with build/tools/caffe, the accuracies printed for the two test states are not identical.
The accuracy layers are connected correctly, too.
What could be the reason for this mismatch?
I was able to fix the issue by using the same batch_size for all the test_states. It looks like Caffe expects all test stages to use the same batch_size.
Hope this answer helps someone in the future.
By the way, I think this could be filed as a bug with the Caffe community. I hit the issue on the latest commit of Caffe (df412ac).