I'm trying to find an algorithm that efficiently computes a combination of convolution and correlation, such as the following:
c(x,y)=(sum of i, (sum of j, a(x-i,y+j)*b(i,j)))
I know that 1-D convolution or correlation can be computed by
a conv b = ifft(fft(a).*fft(b))
a corr b = ifft(fft(a).*conj(fft(b)))
But I have no idea how to combine them in 2-D or N-D problems. I think it is similar to 2-D convolution, but I don't know how to derive it.
The correlation can be written in terms of the convolution by reversing one of the arguments:
corr(x(t),y(t)) = conv(x(t),y(-t))
Thus, if you want the x-axis to behave like a convolution and the y-axis to behave like a correlation, reverse one of the arguments along the y-axis only and compute the convolution. It doesn't matter whether you use a spatial or a frequency domain implementation.
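A minimal NumPy sketch of this idea, assuming periodic (circular) boundaries and two arrays of equal shape. The function names mixed_conv_corr and mixed_direct are illustrative and not from the original posts. Reversing b along the second axis turns the mixed sum into an ordinary 2-D convolution, which the FFT then evaluates:

```python
import numpy as np

def mixed_conv_corr(a, b):
    """c(x,y) = sum_i sum_j a(x-i, y+j) * b(i,j), with circular indexing."""
    # Circular reversal along axis 1 only: b_rev[i, j] = b[i, (-j) mod N]
    b_rev = np.roll(b[:, ::-1], 1, axis=1)
    # Ordinary 2-D circular convolution of a with the reversed kernel
    return np.real(np.fft.ifft2(np.fft.fft2(a) * np.fft.fft2(b_rev)))

def mixed_direct(a, b):
    """Brute-force reference for the same sum (only for checking small inputs)."""
    M, N = a.shape
    c = np.zeros((M, N))
    for x in range(M):
        for y in range(N):
            for i in range(M):
                for j in range(N):
                    c[x, y] += a[(x - i) % M, (y + j) % N] * b[i, j]
    return c

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 5))
b = rng.standard_normal((4, 5))
fft_result = mixed_conv_corr(a, b)
direct_result = mixed_direct(a, b)
```

For a non-periodic (linear) result, both arrays would need to be zero-padded before the transforms; that case is not covered by this sketch.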
How to write two column vectors as an analytic convolution so that the discrete FFT may be used. MATLAB syntax is used.
Consider:
a set of vectors which, when sorted into a step function, appear as any of the following:
[1,1,1,1,0,0,0,0], or [1,1,1,1,1,0,0,0], or [1,1,1,1,1,1,0,0]
(...the location at which the function "steps up" varies over members of this set)
The other is a random vector, vec=[1,0,1,0,1,1,1,0]. Obviously, both contain only 0s and 1s.
Is it possible to write these vectors as an analytic convolution? I would like the 1st, 2nd, 3rd, 4th... entries of the convolution to have values of:
sum(vec.*[1,0,0,0,0,0,0,0])
sum(vec.*[1,1,0,0,0,0,0,0])
sum(vec.*[1,1,1,0,0,0,0,0])
sum(vec.*[1,1,1,1,0,0,0,0])
...
sum(vec.*[1,1,1,1,1,1,1,1])
For speed, I am trying to avoid use of a for-loop. I cannot vectorize because this requires terabytes of RAM. (I work with vectors that are not of length 8, but rather length nearly a million).
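For reference, the values listed above amount to a running sum of vec, and a running sum is the leading part of the linear convolution of vec with a vector of ones. The following is only a sketch in NumPy rather than MATLAB, assuming zero-padding to twice the length so that the circular FFT convolution matches the linear one; the variable names are illustrative:

```python
import numpy as np

vec = np.array([1, 0, 1, 0, 1, 1, 1, 0])
n = len(vec)
ones = np.ones(n)

# Zero-pad to 2*n so the circular convolution does not wrap around
size = 2 * n
full = np.fft.irfft(np.fft.rfft(vec, size) * np.fft.rfft(ones, size), size)

# Entry k of the convolution is sum(vec[0..k]); round away FFT round-off
running = np.rint(full[:n]).astype(int)
```

Memory use here grows linearly with the vector length, so no n-by-n matrix of shifted step vectors is needed.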
The convolution theorem gives the function R, the convolution of the functions L and 1/w, in terms of the Fourier transform F and its inverse F^-1 as,
Clearly, the function 1/(w-w') in the convolution comes from 1/w under F; it's as if you just set w'=0. But if I apply analogous reasoning to my [1,1,1,1,0,0,0,0], I get either [1,1,1,1,1,1,1,1], the identity under .* in MATLAB, or [0,0,0,0,0,0,0,0] (a very boring result).
What is the mistake in reasoning I've made?
Given a dataset of RGB images of a hand together with the 3D positions of the hand's keypoints, I want to treat this as a regression problem in deep learning. The input will be the RGB image, and the output should be the estimated 3D positions of the keypoints.
I have seen some information about regression, but most examples estimate a single value. Is it possible to estimate multiple values (or outputs) all at once?
For now I have referred to this code, whose author is trying to estimate the age of the person in an image.
The output vector of a neural net can represent anything, as long as you define the loss function well. Say you want to detect the (x,y,z) coordinates of 10 keypoints: just use a 30-element output vector such as (x1,y1,z1,x2,y2,z2,...,x10,y10,z10), where xi,yi,zi denote the coordinates of the ith keypoint. You can use any order you find convenient. Just be careful with your loss function. If you want an RMSE loss, you would have to extract the triples correctly and then calculate the RMSE for each keypoint. Or, if you are familiar with linear algebra, reshape the output into a 3x10 matrix, arrange your ground-truth values as a 3x10 matrix in the same way, and then just use
loss = tf.sqrt(tf.reduce_mean(tf.squared_difference(Y1, Y2)))
But once you have formulated your net you will have to stick to it.
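The reshaping step can be sketched as follows. This is an illustration in NumPy rather than TensorFlow, and it assumes the flat layout (x1,y1,z1,...,x10,y10,z10) described above; the function name keypoint_rmse is hypothetical, and the sketch uses a 10x3 array (one row per keypoint), which is the row-major reshape of that layout:

```python
import numpy as np

def keypoint_rmse(pred, target, num_keypoints=10):
    """Return (overall RMSE, per-keypoint RMSE) for flat (x1,y1,z1,...) vectors."""
    # Each row holds the (x, y, z) triple of one keypoint
    pred = np.asarray(pred, dtype=float).reshape(num_keypoints, 3)
    target = np.asarray(target, dtype=float).reshape(num_keypoints, 3)
    sq_err = (pred - target) ** 2
    overall = np.sqrt(sq_err.mean())            # one scalar, usable as a loss
    per_keypoint = np.sqrt(sq_err.mean(axis=1)) # one value per keypoint
    return overall, per_keypoint

# Toy check: every coordinate is off by exactly 1
overall, per_keypoint = keypoint_rmse(np.zeros(30), np.ones(30))
```

The overall value matches what the one-line loss above computes; the per-keypoint values are only there to help spot which keypoints are estimated poorly.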
I have recently been learning DM script for TEM image processing.
I needed a Gaussian blur and found a script named 'Gaussian Blur' at http://www.dmscripting.com/recent_updates.html
This code implements the Gaussian blur by multiplying the fast Fourier transform (FFT) of the source image by the FFT of a Gaussian-kernel image and then taking the inverse Fourier transform of the product.
Here is the relevant part of the code:
// Carry out the convolution in Fourier space
compleximage fftkernelimg:=realFFT(kernelimg) // FFT of Gaussian-kernel image
compleximage FFTSource:=realfft(warpimg) // FFT of source image
compleximage FFTProduct:=FFTSource*fftkernelimg.modulus().sqrt()
realimage invFFT:=realIFFT(FFTProduct)
My question is about this line:
compleximage FFTProduct:=FFTSource*fftkernelimg.modulus().sqrt()
Why does the FFT of the Gaussian kernel need '.modulus().sqrt()' for the convolution?
Is it related to the fact that the Fourier transform of a Gaussian function is another Gaussian function?
Or is it related to some limitation of the discrete Fourier transform?
Please answer me
Thanks
This is related to the general precision limits of floating-point numerical computing (see e.g. here, or more in depth here).
A rotationally symmetric, real-valued Gaussian with standard deviation sigma should transform into a purely real-valued, rotationally symmetric Gaussian with a standard deviation proportional to 1/sigma. Doing this numerically, however, will show you deviations. Just try the following:
number sigma = 30
number A0 = 1
realimage first := RealImage( "First", 8, 256, 256 )
first = A0 * exp( - (iradius**2/(2*sigma*sigma) ))
first.showimage()
complexImage second := FFT(first)
second.Showimage()
image nonZeroImaginaryMask = ( 0 != second.Imaginary() )
nonZeroImaginaryMask.Showimage()
nonZeroImaginaryMask.SetLimits(0,1)
When you then multiply these complex images (before back-transforming), you introduce even more errors. By using the modulus, one ensures that the forward-transformed kernel is purely real and hence a better "damping" curve.
A better implementation of FFT filtering code would actually create the FFT(Gaussian) directly with a standard deviation proportional to 1/sigma, as this is the analytically correct result. Computing an FFT of the kernel only makes sense if the kernel (or its FFT) is not analytically known.
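The same effect can be reproduced outside DigitalMicrograph. The following is a rough NumPy sketch, not the script's own method: it assumes a 256x256 image, sigma = 30 pixels, and a Gaussian centred on pixel (128, 128). It measures the spurious imaginary part of the numerical transform and compares the modulus with the analytic transform of a continuous Gaussian:

```python
import numpy as np

n, sigma = 256, 30.0
coords = np.arange(n) - n // 2
yy, xx = np.meshgrid(coords, coords, indexing="ij")
kernel = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))

# Numerical transform: analytically real, but round-off leaves a small imaginary part
spectrum = np.fft.fft2(kernel)
imag_ratio = np.max(np.abs(spectrum.imag)) / np.max(np.abs(spectrum.real))

# Analytic transform of the continuous Gaussian, sampled at the FFT frequencies
# (cycles per pixel): 2*pi*sigma^2 * exp(-2*pi^2*sigma^2*|f|^2)
f = np.fft.fftfreq(n)
fy, fx = np.meshgrid(f, f, indexing="ij")
analytic = 2 * np.pi * sigma**2 * np.exp(-2 * np.pi**2 * sigma**2 * (fx**2 + fy**2))

# Largest deviation between |FFT| and the analytic curve, relative to the peak
max_dev = np.max(np.abs(np.abs(spectrum) - analytic)) / analytic.max()

print("relative imaginary residue:", imag_ratio)
print("relative deviation from analytic Gaussian:", max_dev)
```

The remaining deviation from the analytic curve comes mostly from the Gaussian being cut off at the image border, which is one more reason to build the frequency-domain filter analytically when its form is known.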
In general: when implementing any "maths" in program code, it pays hugely to think it through with the limits of numerical computation in the back of your head. Reduce actual computation whenever possible (i.e. compute analytically and use the result instead of relying on brute-force numerical computation) and try to "reshape" equations where possible: e.g. avoid large sums over many small numbers, be careful about checks against exact numeric values, and try to avoid expressions that are very sensitive to small numerical errors.
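A tiny illustration of two of these points (summing many small numbers, and checking against exact values), written in Python with only the standard library; the variable names are illustrative:

```python
import math

values = [0.1] * 10

naive = sum(values)              # plain left-to-right floating-point summation
compensated = math.fsum(values)  # summation that tracks the lost low-order bits

# An exact equality check fails for the naive sum but not for the compensated one
naive_is_one = (naive == 1.0)
compensated_is_one = (compensated == 1.0)

# A tolerance-based comparison is the safer check in either case
close_enough = math.isclose(naive, 1.0, rel_tol=1e-12)
```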
I have 3 arrays: X, Y and Z. Each has 8 elements. For each possible combination (X,Y,Z) I have a value V.
I am looking to find a formula e.g. V=f(X,Y,Z). Any idea about how that can be done?
Thank you in advance,
Astry
You have a function sampled on a (possibly nonuniform) 3D grid, and want to evaluate the function at any arbitrary point within the volume. One way to approach this (some say the best) is as a multivariate spline evaluation. https://en.wikipedia.org/wiki/Multivariate_interpolation
First, you need to find which rectangular parallelepiped contains the (x,y,z) query point, then you need to interpolate the value from the nearest points. The easiest thing is to use trilinear interpolation from the nearest 8 points. If you want a smoother surface, you can use quadratic interpolation from 27 points or cubic interpolation from 64 points.
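The two steps for the trilinear case can be sketched in NumPy as follows. This assumes the samples are stored as a 3-D array V with V[i,j,k] the value at (xs[i], ys[j], zs[k]) and that the grid coordinates are sorted; the function name trilinear and the test data are illustrative only:

```python
import numpy as np

def trilinear(xs, ys, zs, V, x, y, z):
    """Trilinear interpolation of V on a rectilinear (possibly nonuniform) grid."""
    def locate(grid, q):
        # Index of the lower grid node of the cell containing q, and the
        # fractional position of q inside that cell (0 at lower, 1 at upper node)
        i = int(np.clip(np.searchsorted(grid, q) - 1, 0, len(grid) - 2))
        t = (q - grid[i]) / (grid[i + 1] - grid[i])
        return i, t

    i, tx = locate(xs, x)
    j, ty = locate(ys, y)
    k, tz = locate(zs, z)

    cube = V[i:i + 2, j:j + 2, k:k + 2]          # the 8 surrounding samples
    face = cube[0] * (1 - tx) + cube[1] * tx     # interpolate along x
    edge = face[0] * (1 - ty) + face[1] * ty     # then along y
    return edge[0] * (1 - tz) + edge[1] * tz     # then along z

# Toy data: 8 nonuniform coordinates per axis, values from a linear function
xs = ys = zs = np.array([0.0, 1.0, 2.0, 4.0, 7.0, 8.0, 9.0, 12.0])
X, Y, Z = np.meshgrid(xs, ys, zs, indexing="ij")
V = 2 * X + 3 * Y - Z + 1
value = trilinear(xs, ys, zs, V, 2.5, 0.3, 10.0)
```

Trilinear interpolation reproduces a linear function exactly, which makes such data a convenient sanity check. Queries outside the grid are extrapolated from the nearest cell in this sketch.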
For repeated queries of a tricubic spline, your life would be a bit easier by preprocessing the spline to generate Hermite patches/volumes, where your sample points not only have the function value, but also its derivatives (∂/∂x, ∂/∂y, ∂/∂z). That way you don't need messy code for the boundaries at evaluation time.