Mask R-CNN annotation tool - deep-learning

I’m new to deep learning and I was reading some state of art papers and I found that mask r-cnn is utterly used in segmentation and classification of images. I would like to apply it to my MSc project but I got some questions that you may be able to answer. I apologize if this isn’t the right place to do it.
First, I would like to know what are the best strategy to get the annotations. It seems kind of labor intensive and I’m not understanding if there is any easy way. Following that, I want to know if you know any annotation tool for mask r-cnn that generates the binary masks that are manually done by the user.
I hope this can turn into a productive and informative thread so any suggestion, experience would be highly appreciated.
Regards

You can use MASK-RCNN, I recommend it, is a two-stage framework, first you can scan the image and generate areas likely contain an object. And the second stage classifies the proposal drawing bounding boxes.
But the two-big question
how to train a model from scratch? And What happens when we want to
train our own dataset?
You can use annotations downloaded from the internet, or you can start creating your own annotations, this takes a lot of time!
You have tools like:
VIA GGC image annotator
http://www.robots.ox.ac.uk/~vgg/software/via/via_demo.html
it's online and you don't have to download any program. It is the one that I recommend you, save the images in a .json file, and so you can use the class of ballons that comes by default in SAMPLES in the framework MASK R-CNN, you would only have to put your json file and your images and to train your dataset.
But there are always more options, you have labellimg which is also used for annotation and is very well known but save the files in xml, you will have to make a few changes to your Class in python. You also have labelme, labelbox, etc.

Related

What is the format for the training/testing data for a Computer Vision model

I am trying to build a CV model for detecting objects in videos. I have about 6 videos that have the content I need to train my model. These are things like lanes, other vehicles, etc. that I’m trying to detect.
I’m curious about the format of the dataset I need to train my model with. I can have each frame of each video turn into images and create a large repository of images to train with or I can use the videos directly. Which way do you think is better?
I apologize if this isn't directly a programming question. I'm trying to assemble my data and I couldn't make up my mind about this.
Yolo version 3 is a good starting point. The trained model will have a .weight file and a .cfg file which can be used to detect object from webcam, video in computer or, in Android with opencv.
In opencv python, cv.dnn.readNetFromDarknet("yolov3_tiny.cfg", "CarDetector.weights") can be used load the trained model.
In android similar code,
String tinyYoloCfg = getPath("yolov3_tiny.cfg", this);
String tinyYoloWeights = getPath("CarDetector.weights", this);
Net tinyYolo = Dnn.readNetFromDarknet(tinyYoloCfg, tinyYoloWeights);
Function reference can be found here,
https://docs.opencv.org/4.2.0/d6/d0f/group__dnn.html
Your video frames need to be annotated with a tool that generates bounding boxes in yolo format and there are quite a few available. In order to train custom model this repository contains all necessary information,
https://github.com/AlexeyAB/darknet

How can I start writing the code for my layer?

I have seen that researchers are adding some functionalities to the original version of Caffe and use those layers and functionalities according to what they need and then these versions are shared through Github. If I am not mistaken, there are two ways: 1) by recompiling Caffe after adding c++ and Cuda versions of layers. 2) writing a python code for the functionality and call it as python layer in Caffe.
I want to add a new layer to Caffe based on my research problem. I really do not from which point should I start writing the new layer and which steps I should consider.
My questions are:
1) Is there any documentation or any learning materials that I can use it for writing the layer?
2) Which way of above-mentioned methods of adding a new layer is preferred?
I really appreciate any help and guidance
Thanks a lot
For research purposes, for "playing around", it is usually more convenient to write a python layer: saves you the hustle of compiling etc.
You can find a short tutorial on "Python" layer here.
On the other hand, if you want better performance you should write a native c++ code for your layer.
You can find a short explanation about it here.

Scala libraries for a GUI with vector graphics

If I wanted to create a "good old" desktop GUI program with some basic selectable vector graphics (arrows, colored boxes, text) to show mostly text and some diagrams (no need for 3D or particles, etc.) that then can be edited by the user, which Scala library would I use, and why? I'd want to build a tool for something similar to drawing diagrams.
Mostly I don't want to use things like D3.js or other web SVG stuff because it is painfully slow and can't show the amounts of content I want it to. But if there are exceptional advantages by using, say, Scala.js, for this purpose, it could be still of interest if there are no better means.
Alternatively, someone can point me to where this had been discussed? Did not find anything on Google or here.
Scala-Swing is a wrapper for Java Swing which is quite nice. As you may know, below Swing lies Java2D, so you will most likely want to look for Java2D based libraries. There aren't that many for Scala that I'm aware of, but more for Java (which you can easily use).
JFreeChart and scala-chart - more for charting
JGraph
JUNG
Processing might have some libraries for your use case
JHotDraw - not sure it's still maintained
NetBeans - not sure it's still maintained
Prefuse
And of course, if it's simple enough, there is nothing wrong with using Java2D directly.

Weka: Limitations on what one can output as source?

I was consulting several references to discover how I may output trained Weka models into Java source code so that I may use the classifiers I am training in actual code for research applications I have been developing.
As I was playing with Weka 3.7, I noticed that while it does output Java code to its main text buffer when use simpler classification (supervised in my case this time) methods such as J48 decision tree, it removes the option (rather, it voids it by removing the ability to checkmark it and fades the text) to output Java code for RandomTree and RandomForest (which are the ones that give me the best performance in my situation).
Note: I am clicking on the "More Options" button and checking "Output source code:".
Does Weka not allow you to output RandomTree or RandomForest as Java code? If so, why? Or if it does and just doesn't put it in the output buffer (since RF is multiple decision trees which I imagine it doesn't want to waste buffer space), how does one go digging up where in the file system Weka outputs java code by default?
Are there any tricks to get Weka to give me my trained RandomForest as Java code? Or is Serialization of the output *.model files my only hope when it comes to RF and RandomTree?
Thanks in advance to those who provide help.
NOTE: (As an addendum to the answer provided below) If you run across a similar situation (requiring you to use your trained classifier/ML model in your code), I recommend following the links posted in the answer that was provided in response to my question. If you do not specifically need the Java code for the RandomForest, as an example, de-serializing the model works quite nicely and fits into Java application code, fulfilling its task as a trained model/hardened algorithm meant to predict future unlabelled instances.
RandomTree and RandomForest can't be output as Java code. I'm not sure for the reasoning why, but they don't implement the "Sourceable" interface.
This explains a little about outputting a classifier as Java code: Link 1
This shows which classifiers can be output as Java code: Link 2
Unfortunately I think the easiest route will be Serialization, although, you could maybe try implementing "Sourceable" for other classifiers on your own.
Another, but perhaps inconvenient solution, would be to use Weka to build the classifier every time you use it. You wouldn't need to load the ".model" file, but you would need to load your training data and relearn the model. Here is a starters guide to building classifiers in your own java code http://weka.wikispaces.com/Use+WEKA+in+your+Java+code.
Solved the problem for myself by turning the output of WEKA's -printTrees option of the RandomForest classifier into Java source code.
http://pielot.org/2015/06/exporting-randomforest-models-to-java-source-code/
Since I am using classifiers with Android, all of the existing options had disadvantages:
shipping Android apps with serialized models didn't reliably work across devices
computing the model on the phone took too much resources
The final code will consist of three classes only: the class with the generated model + two classes to make the classification work.

Creating 3d models and converting them into a form usable by WebGL - a beginners guide?

I'm looking for some simple (free!) entry level software that will allow me to create simple 3d models and export them in a format (JSON?) that can then be read into a webGL programme.
Simple geometry would be a start, then textures would be nice too... I've looked at Blender, and it's just far too advanced for me, and the tutorials I've found are hopeless.
Something simple like sketchup would be good, but afaik you can't export in JSON. I've found some converters that will do .dae to .json, but the ones I've found seem to be for advanced users.
WebGL is new enough that there aren't many packages like this built up around it just yet. That doesn't mean you don't have some options though:
Blender is a good modeler, and if you are willing to put a little bit more time into learning it you can use exporters from Three.js or some others that are around the net. This seems to be the most popular option at the moment.
Unity 3D is more of a scene builder than a modeling app, but it has a lot of ways to get content into it and both J3D and myself have implemented exporters from it.
Maya is a great modeling tool if you have a way to get access to it (it's commercial), and has Inka to get WebGL content out.
If you want to use something like SketchUp, it should be able to export to COLLADA, which can then be imported into Blender/Unity/What have you and exported from there using one of the previous methods.
As far as formats go, there's no real standards just yet. Most of the exporters will spit out JSON, mine uses a mix of JSON and Binary for speed/size, and some will actually give you Javascript code to execute. Which format to use probably depends on what you want to do with it. I encourage you to experiment with several and see what you like and what you don't.