torchvision Faster R-CNN RoIHead code
Why do we need to append the ground-truth bboxes to the proposals in the Faster R-CNN RoIHead?
I didn't find an answer to this question. I have looked at many explanations of Faster R-CNN, but I can't find anything about this.
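For context, the operation in question amounts to concatenating each image's ground-truth boxes onto its RPN proposals before target assignment and sampling. A minimal sketch of that step (simplified and paraphrased, not the verbatim torchvision source; the real method is RoIHeads.add_gt_proposals):

```python
import torch

def add_gt_proposals(proposals, gt_boxes):
    # For every image in the batch, concatenate its ground-truth boxes
    # onto the RPN proposals before target assignment / sampling.
    return [
        torch.cat((proposal, gt_box))
        for proposal, gt_box in zip(proposals, gt_boxes)
    ]

# Hypothetical shapes: 2 images, 1000 proposals each, 3 and 5 GT boxes.
proposals = [torch.rand(1000, 4), torch.rand(1000, 4)]
gt_boxes = [torch.rand(3, 4), torch.rand(5, 4)]
augmented = add_gt_proposals(proposals, gt_boxes)
print([p.shape for p in augmented])  # 1003 and 1005 boxes per image
```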
Related
I have an LSTM model I'm using for time series predictions. In training, it already converges after 3 epochs. The model performs quite well on the test data, but should I still be concerned about the fast convergence, or should performance on the test set be the overriding factor in deciding whether a model is good?
There are plenty of data points (100k) and two hidden layers with 124 and 64 nodes, so I don't think the model lacks complexity or data.
I have read several sources that implicitly suggest batch norm should be turned off for inference, but I have found no definite answer for this.
The most common approach is to use a moving average of the mean and std for your batch normalization, as Keras does for example (https://github.com/keras-team/keras/blob/master/keras/layers/normalization.py). If you simply turn it off, the network will perform worse on the same data because of the change in how the inputs are processed.
This is done by storing a running average of the mean and std over all the batches seen while training the network. At inference, this moving average is used for normalization.
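A minimal NumPy sketch of that idea (my own illustration, not the Keras code linked above): keep moving averages of the batch statistics during training and normalize with them at inference.

```python
import numpy as np

class MovingAverageBatchNorm:
    """Minimal 1-D batch norm keeping moving averages of mean/variance.

    Sketch only (no learnable scale/shift); Keras stores the same kind
    of moving statistics internally and uses them at inference time.
    """

    def __init__(self, num_features, momentum=0.99, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.moving_mean = np.zeros(num_features)
        self.moving_var = np.ones(num_features)

    def __call__(self, x, training):
        if training:
            batch_mean = x.mean(axis=0)
            batch_var = x.var(axis=0)
            # Update the moving statistics with the current batch.
            self.moving_mean = (self.momentum * self.moving_mean
                                + (1 - self.momentum) * batch_mean)
            self.moving_var = (self.momentum * self.moving_var
                               + (1 - self.momentum) * batch_var)
            mean, var = batch_mean, batch_var
        else:
            # At inference, normalize with the stored moving averages,
            # so the result does not depend on the rest of the batch.
            mean, var = self.moving_mean, self.moving_var
        return (x - mean) / np.sqrt(var + self.eps)
```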
I have a CNN that is learning pretty well on a dataset I created. I added Batch Normalization to this network to try to improve performance.
But when I try to make a prediction on a single image, I always end up with the same result (whatever the image). I think it is because batch normalization actually needs batches to work.
So, is it possible to make a prediction on a single image with a CNN that uses BN?
I thought about deleting the BN layers once my network is done training; is that the way to go?
Thank you :)
I found the exact answer to the problem I was facing here: https://r2rt.com/implementing-batch-normalization-in-tensorflow.html
In the section "Making predictions with the model", it is explained that when using BN, you need to estimate the population mean and population variance on your training set during training, so that you don't need a batch at test time (which would be "cheating") :)
I have a 512x512 image and I want to perform per-pixel classification of it. I have already trained the model using 80x80 patches. So at test time I have 512x512 = 262,144 patches, each of size 80x80, and this classification is too slow. How can I improve the testing time? Please help me out.
I might be wrong, but there are not many ways to speed up the testing phase; the main one is to reduce the number of neurons in the NN in order to reduce the number of operations:
80x80 patches are really big; you may want to reduce their size and retrain your NN. That alone already reduces the number of neurons a lot.
Analyze the NN weights/inputs/outputs to detect neurons that do not matter. For example, some may always return 0 and can be deleted from the NN. Then retrain the NN with the simplified architecture.
If you have not done so already, it is much faster to feed a batch of patches (the bigger the better) instead of one patch at a time; see the sketch below.
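As an illustration of the batching point, a rough sketch (assuming a Keras-style model whose predict() accepts an array of patches; the function name, padding choice, and batch size are made up):

```python
import numpy as np

def classify_image_in_batches(image, model, patch=80, batch_size=1024):
    """Per-pixel classification by scoring one patch per pixel,
    grouped into large batches instead of one patch at a time.

    Sketch only: `image` is an (H, W, C) array and `model` is assumed to be
    a Keras-style model whose predict() takes (n, patch, patch, C) arrays.
    """
    half = patch // 2
    padded = np.pad(image, ((half, half - 1), (half, half - 1), (0, 0)),
                    mode="reflect")
    h, w = image.shape[:2]
    coords = [(r, c) for r in range(h) for c in range(w)]

    predictions = np.empty(h * w, dtype=np.int64)
    for start in range(0, len(coords), batch_size):
        chunk = coords[start:start + batch_size]
        batch = np.stack([padded[r:r + patch, c:c + patch] for r, c in chunk])
        probs = model.predict(batch, verbose=0)          # one forward pass per batch
        predictions[start:start + len(chunk)] = probs.argmax(axis=1)

    return predictions.reshape(h, w)
```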
I am working on a CFD project and I am using the new CUDA 5 library "cusparse" to solve a system of linear equations. I tested the sample code "conjugateGradientPrecond". The results show that preconditioned conjugate gradient with ILU took more time to reach the final answer than conjugate gradient without preconditioning. The former does need fewer iterations, but it spends too much time in "cusparseScsrsv_solve", so the overall time is longer.
Here is my question: is there any other preconditioned conjugate gradient that can greatly decrease the iteration count without relying on a time-consuming routine like "cusparseScsrsv_solve"?
Preconditioning techniques such as ILU0/ILUT and IC0/ICT require solving a triangular system twice at each CG iteration (once for the lower and once for the upper factor of the preconditioning matrix). Solving triangular systems is inherently a sequential problem, but for sparse matrices an analysis phase can uncover some degree of parallelism (refer to this post). In general, there is no single best preconditioner for sparse systems, but simple diagonal (a.k.a. Jacobi) preconditioning imposes negligible overhead and offers a high level of parallelism for a GPU implementation.
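To make the Jacobi point concrete, here is a small sketch in SciPy rather than CUDA, with a made-up SPD test matrix: applying the preconditioner is just an element-wise division by diag(A), so it adds essentially no cost per iteration and parallelizes trivially.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

# Hypothetical symmetric positive-definite test matrix with a strongly
# varying diagonal, so that diagonal scaling actually has an effect.
n = 10000
main = 2.0 + (np.arange(n) % 100)
A = sp.diags([-1.0, main, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

# Jacobi preconditioner: M^-1 is just 1/diag(A), applied element-wise.
M = sp.diags(1.0 / A.diagonal(), format="csr")

counts = {"no preconditioner": 0, "jacobi": 0}

def counter(key):
    def cb(xk):
        counts[key] += 1
    return cb

x_plain, _ = cg(A, b, callback=counter("no preconditioner"))
x_prec, _ = cg(A, b, M=M, callback=counter("jacobi"))
print(counts)  # compare iteration counts with and without Jacobi
```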