Caffe requires at least three .prototxt files: for training, for deployment and to define solver parameters.
My training and deployment files contain identical sections describing the network architecture. Is it possible to refactor this by moving the common part into a separate file?
You are looking for an "all-in-one" network.
See this GitHub discussion for more information.
Apparently, you can achieve this not only by using include {phase: XXX}, but also by taking advantage of stage and state.
The official AllenNLP documentation suggests specifying "validation_data_path" in the configuration file, but what if one wants to construct a dataset from a single source and then randomly split it into train and validation datasets with a given ratio?
Does AllenNLP support this? I would greatly appreciate your comments.
AllenNLP does not have this functionality yet, but we are working on some stuff to get there.
In the meantime, here is how I did it for the VQAv2 reader: https://github.com/allenai/allennlp-models/blob/main/allennlp_models/vision/dataset_readers/vqav2.py#L354
This reader supports Python slicing syntax: for example, you can specify a data_path of "my_source_file[:1000]" to take the first 1000 instances from my_source_file. You can also supply multiple paths by setting data_path: ["file1", "file2[:1000]", "file3[1000:]"]. You can probably steal the top two blocks in that file (lines 354 to 369) and put them into your own dataset reader to achieve the same result.
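To make the slicing idea concrete, here is a minimal sketch of how such a path could be parsed. It is not the code from the VQAv2 reader; the names parse_data_path and select_instances are hypothetical, and the regular expression only handles the simple "[start:stop]" form.

```python
import re

# Hypothetical helper, NOT the AllenNLP implementation: it splits a spec such
# as "my_source_file[:1000]" into a plain path and a Python slice object.
_SLICE_RE = re.compile(r"^(?P<path>.+)\[(?P<start>-?\d+)?:(?P<stop>-?\d+)?\]$")


def parse_data_path(spec):
    """Return (path, slice) for specs like 'file', 'file[:1000]' or 'file[1000:]'."""
    match = _SLICE_RE.match(spec)
    if match is None:
        # No slice suffix: use every instance in the file.
        return spec, slice(None)
    start = int(match["start"]) if match["start"] else None
    stop = int(match["stop"]) if match["stop"] else None
    return match["path"], slice(start, stop)


def select_instances(instances, spec):
    """Apply the slice encoded in `spec` to an iterable of instances."""
    _, window = parse_data_path(spec)
    return list(instances)[window]
```

With something like this in your own reader, a train/validation split from a single source becomes a matter of pointing the training path at "my_source_file[:800]" and the validation path at "my_source_file[800:]". If you need the split to be random, you would have to shuffle the instances with a fixed seed before slicing, so that both passes see the same order.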
I am building a siamese network from the example on BVLC's site.
There, they use a simple convolutional net to generate the features for the contrastive loss function. This is done by copying and pasting the .prototxt of each of the networks into the .prototxt of the final siamese network. The problem is that I am using a much larger network, whose .prototxt has about 5700 lines.
Is there a directive that allows me to tell it to just "include" that file at runtime? Something along the lines of "input" in LaTeX, so I don't end up with a 12k+ line file.
I am trying to use a pretrained model (VGG-19) in DIGITS, but I get this error:
ERROR: Your deploy network is missing a Softmax layer! Read the
documentation for custom networks and/or look at the standard networks
for examples
I am trying to test with my own dataset, which has only two classes.
I read this and this and tried to modify the last layer, but I got another error. How can I modify the layers for a new dataset?
When I modify the last layer, I get this error:
ERROR: Layer 'softmax' references bottom 'fc8' at the TRAIN stage however this blob is not included at that stage. Please consider using an include directive to limit the scope of this layer.
You're having a problem because you're trying to upload a "train/val" network when you really need to be uploading an "all-in-one" network. Unfortunately, we don't document this very well. I've created an RFE to remind us to improve the documentation.
Try to adjust the last layers in your network to look something like this: https://github.com/NVIDIA/DIGITS/blob/v4.0.0/digits/standard-networks/caffe/lenet.prototxt#L162-L184
For more information, here is how I've proposed updating Caffe's example networks to all-in-one nets, and here is how I updated the default DIGITS networks to be all-in-one nets.
Say I have an architecture similar to the Layered Architecture Sample. Let's also assume each large box is its own project. The Frameworks box and each layer would then be its own project. If we don't use IoC and instead go the traditional layered approach without interfaces, Service would reference Business which would reference Data and all of these would reference Frameworks.
Now the requirement is that logging be done to the database, so presumably Frameworks would need to reference Service to reach Data. There are two issues with this:
You now have a circular dependency (remember, no interfaces). Most of this can be solved if you use interfaces with IoC (Dependency Injection or Service Locator) and a composition root. Is this the only way or can you somehow do it without an interface?
If you're in Business and need to log something, you have to make the jump to Service again. Is there any way to only make the jump from Presentation but not from Service/Business/Data without a composition root?
I know I'm somewhat answering my own question, but I basically want to understand if this architecture is feasible at all without IoC.
Without some inversion of control, there is not too much you can do.
Assuming your language supports something like Reflection in .NET you could try to dynamically invoke code and invert the control at runtime. You don't need interfaces but you might have to mark/decorate or have a convention for the types that you need in the upper layers.
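To illustrate the reflection idea in a language-neutral way, here is a rough Python sketch (Python's importlib plays the role of .NET Reflection here). Everything in it is an assumption for illustration: the resolve helper, the LOG_SINK convention, and the use of the built-in print as a stand-in for a function that would really live in an upper layer.

```python
import importlib


def resolve(dotted_name):
    """Look up 'module.attribute' at runtime instead of importing it statically."""
    module_name, _, attribute = dotted_name.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# Convention assumed for this sketch: configuration names the callable that
# receives log messages. In a real layered solution this string would point at
# a function in an upper layer; the built-in print is only a stand-in.
LOG_SINK = "builtins.print"


def log(message):
    """Frameworks-level entry point with no compile-time reference to the sink."""
    sink = resolve(LOG_SINK)
    sink(f"[log] {message}")
```

The lower layer never references the upper one in code; it only knows a name. You are still inverting control, just by convention and at runtime rather than through an interface, so you lose compile-time checking in exchange.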
I'm just thinking now about crazy, non-pragmatic approaches: you could post-process the binary and inject the logging code in every layer.
For this example, I would call the logging methods in the Business Layer where logging is needed. It really doesn't make any sense to call up a level. That would be an unnecessary abstraction, and it sounds like you've gathered as much.
Is there any abstraction provided in the Services layer for logging that you would require when logging from the business layer? If so, perhaps some sort of facade could be created for the purpose of business layer logging. If you do not require this abstraction, though, I would call the Business logging methods directly.
IMO, since logging is a cross-cutting concern, it should not reference your Data layer. In your question, I see that you assume you are logging to the database. Even if that is your requirement, you have to keep the code that opens the database connection and inserts log records separate from your application's Data layer. It belongs in your logging library, not in the Data layer. Do NOT treat it as part of the Data layer. With this perspective, you can continue to develop and enhance the logging framework while keeping it separate from your Data layer.
From my perspective, the Data layer consists only of application data access, not logging. For concrete examples, look at the NLog or Log4Net libraries and see how they are not concerned with the application's data access strategy.
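As a minimal sketch of that separation (in Python rather than .NET, and with a hypothetical SQLiteLogHandler class and log_records table), a logging handler can own its own connection and insert statement without touching any application data-access code:

```python
import logging
import sqlite3


class SQLiteLogHandler(logging.Handler):
    """Hypothetical handler that owns its own connection, separate from the Data layer."""

    def __init__(self, database=":memory:"):
        super().__init__()
        self._conn = sqlite3.connect(database)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS log_records "
            "(created REAL, level TEXT, logger TEXT, message TEXT)"
        )

    def emit(self, record):
        try:
            self._conn.execute(
                "INSERT INTO log_records VALUES (?, ?, ?, ?)",
                (record.created, record.levelname, record.name, record.getMessage()),
            )
            self._conn.commit()
        except Exception:
            # Standard logging convention: never let a logging failure crash the app.
            self.handleError(record)

    def rows(self):
        """Return (level, logger, message) tuples in insertion order."""
        query = "SELECT level, logger, message FROM log_records ORDER BY rowid"
        return self._conn.execute(query).fetchall()


# Any layer can log through the standard logging API; only the handler knows
# that the records end up in a database.
handler = SQLiteLogHandler()
logger = logging.getLogger("business.orders")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.info("order %s accepted", 42)
```

The Business layer here depends only on the logging API, so there is no reference from the logging code to Service or Data and no circular dependency.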
Hope this helps.
I was consulting several references to discover how I may output trained Weka models into Java source code so that I may use the classifiers I am training in actual code for research applications I have been developing.
While playing with Weka 3.7, I noticed that it outputs Java code to its main text buffer when I use simpler (supervised, in my case) classification methods such as the J48 decision tree. For RandomTree and RandomForest, however, which give me the best performance in my situation, the option is disabled: the checkbox cannot be ticked and its label is greyed out.
Note: I am clicking on the "More Options" button and checking "Output source code:".
Does Weka not allow you to output RandomTree or RandomForest as Java code? If so, why? Or if it does and just doesn't put it in the output buffer (since RF is multiple decision trees which I imagine it doesn't want to waste buffer space), how does one go digging up where in the file system Weka outputs java code by default?
Are there any tricks to get Weka to give me my trained RandomForest as Java code? Or is Serialization of the output *.model files my only hope when it comes to RF and RandomTree?
Thanks in advance to those who provide help.
NOTE: (As an addendum to the answer provided below) If you run across a similar situation (requiring you to use your trained classifier/ML model in your code), I recommend following the links posted in the answer that was provided in response to my question. If you do not specifically need the Java code for the RandomForest, as an example, de-serializing the model works quite nicely and fits into Java application code, fulfilling its task as a trained model/hardened algorithm meant to predict future unlabelled instances.
RandomTree and RandomForest can't be output as Java code. I'm not sure of the reasoning, but they don't implement the "Sourceable" interface.
This explains a little about outputting a classifier as Java code: Link 1
This shows which classifiers can be output as Java code: Link 2
Unfortunately I think the easiest route will be Serialization, although, you could maybe try implementing "Sourceable" for other classifiers on your own.
Another, though perhaps inconvenient, solution would be to use Weka to build the classifier every time you use it. You wouldn't need to load the ".model" file, but you would need to load your training data and relearn the model. Here is a starter's guide to building classifiers in your own Java code: http://weka.wikispaces.com/Use+WEKA+in+your+Java+code.
Solved the problem for myself by turning the output of WEKA's -printTrees option of the RandomForest classifier into Java source code.
http://pielot.org/2015/06/exporting-randomforest-models-to-java-source-code/
Since I am using classifiers with Android, all of the existing options had disadvantages:
shipping Android apps with serialized models didn't reliably work across devices
computing the model on the phone took too many resources
The final code will consist of three classes only: the class with the generated model + two classes to make the classification work.