How to choose a fixed clip_gradients value [caffe]

In caffe.proto
// Set clip_gradients to >= 0 to clip parameter gradients to that L2 norm,
// whenever their actual L2 norm is larger.
optional float clip_gradients = 35 [default = -1];
I am having trouble setting clip_gradients. I think it should be dynamic anyway, but if we are to choose a fixed number, how should we choose it? Is Caffe setting it to 35? What does that mean? I have experimented with a number of fixed choices but I see little difference. I understand the exploding gradients / gradient clipping concept in the broad sense, but I am not sure how I should choose a fixed number in the solver.

You can print the overall L2 norm of the parameter gradients for a few iterations to get an idea of what clip_gradients should be. This can be done this way:
net_->Forward();
net_->Backward();
const vector<Blob<Dtype>*>& net_params = net_->learnable_params();
float sumsq_diff = 0;
for (int i = 0; i < net_params.size(); ++i) {
  sumsq_diff += net_params[i]->sumsq_diff();
}
// sqrt of the summed squared diffs = L2 norm of all parameter gradients
std::cout << "gradient L2 norm: " << std::sqrt(sumsq_diff) << "\n";
net_->Update();
For details about how clip_gradients is used see solver.cpp.
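Once you have a feel for the typical norm, you set the threshold in your solver definition; one option is to pick a value around (or slightly above) the norms you printed, so that only unusually large spikes get clipped. A minimal sketch of the solver prototxt entry (the value 10 is purely illustrative):

# solver.prototxt (excerpt)
# ... the rest of your solver settings ...
clip_gradients: 10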

Related

How to interpret FFT data for making a spectrum visualizer

I am trying to visualize a spectrum where the frequency range is divided into N bars, either linearly or logarithmically. The FFT seems to work fine, but I am not sure how to interpret the values in order to decide the max height for the visualization.
I am using FMODAudio, a C# wrapper; it is set up correctly.
In the case of a linear spectrum, the bars are defined as follows:
public int InitializeSpectrum(int windowSize = 1024, int maxBars = 16)
{
    numSamplesPerBar_Linear.Clear();
    int barSamples = (windowSize / 2) / maxBars;
    for (int i = 0; i < maxBars; ++i)
    {
        numSamplesPerBar_Linear.Add(barSamples);
    }
    IsInitialized = true;
    Data = new float[numSamplesPerBar_Linear.Count];
    return numSamplesPerBar_Linear.Count;
}
Data is the array which holds the spectrum values received from the update loop.
The update looks like this:
public unsafe void UpdateSpectrum(ref ParameterFFT* fftData)
{
    int length = fftData->Length / 2;
    if (length > 0)
    {
        int indexFFT = 0;
        for (int index = 0; index < numSamplesPerBar_Linear.Count; ++index)
        {
            for (int frec = 0; frec < numSamplesPerBar_Linear[index]; ++frec)
            {
                for (int channel = 0; channel < fftData->ChannelCount; ++channel)
                {
                    var floatspectrum = fftData->GetSpectrum(channel); // this is a ReadOnlySpan<float> by default.
                    Data[index] += floatspectrum[indexFFT];
                }
                ++indexFFT;
            }
            Data[index] /= (float)(numSamplesPerBar_Linear[index] * fftData->ChannelCount); // average of both channels for more meaningful values.
        }
    }
}
The values I get when testing a song are very low across the bands.
A randomly chosen moment when playing a song gives these values:
16 bars = 0,0326 0,0031 0,001 0,0003 0,0004 0,0003 0,0001 0,0002 0,0001 0,0001 0,0001 0 0 0 0 0
I realize it's more useful to use a logarithmic spectrum in many cases, and I intend to, but I still need to figure out how to find the max values for each bar so that I can set up the visualization on a proper scale.
Q: How can I know the potential max values for each bar based on this setup (it's not 1.0)?
The output of the FFT call is an array where each element is a complex number (A + Bi), where A is the real component and B the imaginary component. Element zero of this array represents frequency zero, i.e. DC, which is the offset bias and can typically be ignored. As you iterate across the elements of this array you increment the frequency; this frequency increment is calculated using
Audio_samples <-- array of raw audio samples in PCM format which gets fed into the FFT call
num_fft_bins := float64(len(Audio_samples)) / 2.0      // using the Nyquist theorem
freq_incr_per_bin := (input_audio_sample_rate / 2.0) / num_fft_bins
So, to answer your question, the output array from the FFT call is a linear progression of bins, evenly spaced by the frequency increment above.
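As a concrete example (assuming a 1024-sample window at a 44100 Hz sample rate): there are 512 usable bins, each (44100 / 2) / 512 ≈ 43.07 Hz wide, so bin 10 sits at roughly 430 Hz.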
Depends on your input data to the FFT, and the scaling that your particular FFT implementation uses (not all FFTs use the same scale factor).
With an energy preserving forward-FFT, Parseval's theorem applies. So the energy (sum of squares) of the input vector equals the energy of the FFT result vector. Note that for a single integer periodic in aperture sinusoidal input (a pure tone), all that energy can appear in a single FFT result element. So if you know the maximum possible input energy, you can use that to compute the maximum possible result element magnitude for scaling purposes.
The range is often large enough that visualizers commonly need to use log scaling, or else typical input can get pixel quantized to a graph of all zeros.
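For a rough sense of scale (assuming the common unnormalized forward FFT, with no 1/N factor): a sine of amplitude A that lands exactly on bin k of an N-sample window shows up with magnitude A * N / 2 in that bin, so with N = 1024 and A = 1.0 the largest value a single bin can reach is 512. Implementations that divide the output by N (or by N/2) give a peak of A/2 (or A) instead. Which convention your wrapper uses determines the per-bar maximum, so either check its documentation or measure it with a known full-scale test tone.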

libgdx - fixed timestep with interpolation - without box2d

I am having some problems implementing fixed timestep with graphic interpolation in my game.
Here is part of the render method:
@Override
public void render(float delta)
{
    double newTime = TimeUtils.millis() / 1000.0;
    double frameTime = Math.min(newTime - currentTime, 0.25);
    accumulator += frameTime;
    currentTime = newTime;
    while (accumulator >= step)
    {
        updateObjects(step);
        accumulator -= step;
    }
    double alpha = accumulator / step;
    interpolateObjects((float)alpha);
}
Here is updateObjects:
for (int i = 0; i < world.level.gameObjects.size(); i++)
{
    GameObject go = world.level.gameObjects.get(i);
    go.prevPosition.set(go.position); // save previous position
    go.update(delta);
}
interpolateObjects:
for (int i = 0; i < world.level.gameObjects.size(); i++)
{
    GameObject go = world.level.gameObjects.get(i);
    go.position.lerp(go.prevPosition, alpha);
}
And then the objects are rendered using position.
As far as I can tell this should work, but it doesn't.
At high fps (200-400) everything is too slow; movement isn't even visible, I can just see that position is changing by 0.0001 or something like that.
At low fps (10-20), movement is visible but again the objects are very slow.
If I disable interpolation, then everything works as it should (at any fps), but then everything is jittery.
So the problem is somewhere in the interpolation.
Your interpolation go.position.lerp(go.prevPosition, alpha) is set up to assume that prevPosition was last updated at an exact multiple of step, but then when you update prevPosition like this go.prevPosition.set(go.position) you are destroying that contract on the first update of the frame. It also looks like you are lerping backwards (from position to the previous position).
I think you need a third vector so the last interpolated value is guaranteed not to influence your fixed time updates. Here I'll call it interpPosition, and it will be used for drawing instead of position.
You actually seem to technically be extrapolating (not interpolating) the value, since you are not updating ahead of time and your alpha is calculated from time left in the accumulator. If you want to linearly extrapolate from the last two positions calculated, you can do it like this (note the 1+alpha to extrapolate):
for (int i = 0; i < world.level.gameObjects.size(); i++)
{
    GameObject go = world.level.gameObjects.get(i);
    go.interpPosition.set(go.prevPosition).lerp(go.position, 1 + alpha);
}
Depending on the speed of your simulation (and how fast objects can accelerate), this might still look jerky. I think a smoother, but computationally slower, way to do this would be to do a fully calculated update using alpha instead of the step time and store the result in the interpPosition vector. But only do that if necessary.
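For reference, here is a rough sketch of how those pieces could fit together, assuming each GameObject gains an interpPosition vector (a hypothetical extra field used only for drawing):

// updateObjects(float delta) stays as it is: prevPosition is only ever
// written here, inside the fixed-step loop, so it always holds the state
// from exactly one step ago.
for (int i = 0; i < world.level.gameObjects.size(); i++)
{
    GameObject go = world.level.gameObjects.get(i);
    go.prevPosition.set(go.position);
    go.update(delta);
}

// In render(), after the while (accumulator >= step) loop:
float alpha = (float)(accumulator / step);
for (int i = 0; i < world.level.gameObjects.size(); i++)
{
    GameObject go = world.level.gameObjects.get(i);
    // Extrapolate past the last simulated state; position and prevPosition
    // are left untouched, so the fixed-step simulation is unaffected.
    go.interpPosition.set(go.prevPosition).lerp(go.position, 1 + alpha);
}
// Draw each object at go.interpPosition instead of go.position.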

Replace the selected pixel

How do I replace the selected pixel of the image? I used the getPixel and setPixel concept, but I am not getting the desired effect.
http://www.digital-photography-school.com/wp-content/uploads/2009/07/before-after.jpg
var s_color = 0x0083C7;
color_picker.addEventListener(ColorPickerEvent.CHANGE, changeColor);
function changeColor(ColorPickerEvent)
{
    var _color = color_picker.selectedColor.toString(16);
    var color = String("0x" + _color);
    for (var j = 0; j < m_inputImage.width; j++)
    {
        for (var k = 0; k < m_inputImage.height; k++)
        {
            if (m_inputImage.getPixel(j, k) == s_color)
            {
                m_inputImage.setPixel(j, k, color);
            }
        }
    }
    s_color = color;
}
I want a similar type of effect. Please guide me.
This is not a job for BitmapData; you should use Pixel Bender for this.
http://www.adobe.com/devnet/flash/articles/pixel_bender_basics.html
You can find all the shaders here; there are a lot of hue/saturation and color-manipulation filters, so pick the one that suits you best.
http://www.adobe.com/cfusion/exchange/index.cfm?event=productHome&exc=26&loc=en_us
I would use Photoshop instead of Flash to achieve the effect you desire.
However, Photoshop is kind of expensive, so I would use the Bitmap class in conjunction with the BitmapData class, with an algorithm that runs through each pixel, checks for a certain threshold of red, and converts it to the corresponding shade of yellow. If you post the code you have already written I could possibly add to it; I'm not going to spend the next hour writing an example, though.
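One simple rule along those lines (purely illustrative, not tested against the linked image): for each pixel whose red component clearly dominates green and blue, copy the red value into the green channel and leave blue as it is; pure red (0xFF0000) then becomes pure yellow (0xFFFF00), while pixels that are not predominantly red are left untouched.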

What does the mask parameter do in the threshold method of the BitmapData class?

I'm trying to replace a color and colors near it in a bitmap.
threshold() seems to work, but it appears you have to specify either the exact color ("==") or all colors before or after it ("<", ">", "<=", ">="). I am hoping that the mask parameter will help me find a way to match a color plus a dynamic range of colors around it to be replaced. What is its intended usage?
Per the comment below, examples 1 and 2:
bit.threshold(bit, bit.rect, point, ">", 0xff000000, 0xffff0000, 0x00FF0000);
bit.threshold(bit, bit.rect, point, ">", 0xff000000, 0xffff0000, 0x00EE0000);
If you're trying to do a flood fill, I don't think the mask parameter will help you. The mask parameter lets you ignore parts of the color in the test. In your case, you want to take into account all the channels of the color, you just want the matching to be fuzzy.
e.g. If you want to replace all pixels where the red component is 0, you can set mask to 0x00FF0000, so it will ignore the other channels.
The implementation pseudo-code probably looks something like this:
input = readPixel()
value = input & mask
if (value operation threshold)
{
    writePixel(color)
}
Neither of your samples will produce anything because the mask limits the values to be between 0x00000000 and 0x00FF0000, then tests if they're greater than 0xFF000000.
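A quick worked example of why, following the pseudo-code above: a fully opaque pure-red source pixel is 0xFFFF0000; after masking with 0x00FF0000 it becomes 0x00FF0000, which is still far below the threshold 0xFF000000, so the ">" test can never pass. The same holds for the 0x00EE0000 mask.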
I have also done this, and eventually I found it best to create my own threshold method. You can find it below; everything is explained in the comments.
//_snapshot is a BitmapData object
for (var i:int = 0; i < _snapshot.width; i++)
{
    for (var j:int = 0; j < _snapshot.height; j++)
    {
        //We get the color of the current pixel.
        var _color:uint = _snapshot.getPixel(i, j);
        //If the color of the selected pixel is between certain values set by the user,
        //set the filtered pixel data to green.
        //_threshold is a number (can be quite high, up to 50000) to look for adjacent colors in the color space.
        //_colorToCompare is the color you want to look for.
        if ((_colorToCompare - (100 * _threshold)) <= _color && _color <= (_colorToCompare + (100 * _threshold)))
        {
            //This sets the pixel value.
            _snapshot.setPixel(i, j, 0x00ff00);
        }
        else
        {
            //If the pixel color is not within the desired range, set its value to black.
            _snapshot.setPixel(i, j, 0x000000);
        }
    }
}

Bitmap conversion - Creating a transparent + black image from a B&W source

I have a whole bunch of JPG files that I need to use in a project, which for one reason or another cannot be altered. Each file is similar (handwriting), black pen on a white background. However, I need to use these assets against a non-white background in my Flash project, so I'm trying to do some client-side processing to get rid of the backgrounds using getPixel and setPixel32.
The code I am currently using does a linear comparison, and while it works, the results are less than expected, as the shades of grey are getting lost in the mix. More than just tweaking my parameters to get things looking proper, I get the feeling that my method for computing the RGBA value is weak.
Can anyone recommend a better solution than what I'm using below? Much appreciated!
private function transparify(data:BitmapData) : Bitmap {
    // Create a new BitmapData with transparency to return
    var newData:BitmapData = new BitmapData(data.width, data.height, true);
    var orig_color:uint;
    var alpha:Number;
    var percent:Number;
    // Iterate through each pixel using nested for loop
    for (var x:int = 0; x < data.width; x++){
        for (var y:int = 0; y < data.height; y++){
            orig_color = data.getPixel(x,y);
            // percent is the opacity percentage, white should be 0,
            // black would be 1, greys somewhere in the middle
            percent = (0xFFFFFF - orig_color)/0xFFFFFF;
            // To get the alpha value, I multiply 256 possible values by
            // my percentage, which gets multiplied by 0xFFFFFF to fit in the right
            // value for the alpha channel
            alpha = Math.round(( percent )*256)*0xFFFFFF;
            // Adding the alpha value to the original color should give me the same
            // color with an alpha channel added
            var newCol = orig_color+alpha;
            newData.setPixel32(x,y,newCol);
        }
    }
    var newImg:Bitmap = new Bitmap(newData);
    return newImg;
}
Since it's a white background, blendMode may give you a better result: with BlendMode.MULTIPLY the white background simply takes on the color of whatever is behind the bitmap (multiplying by white leaves the backdrop unchanged), while the black pen strokes stay black and the greys darken the backdrop proportionally.