DM Script, why does the fourier transform of gaussian-kenel needs modulus - fft

Recently I learn DM_Script for TEM image processing
I needed Gaussian blur process and I found one whose name is 'Gaussian Blur' in http://www.dmscripting.com/recent_updates.html
This code implements Gaussian blur algorithm by multiplying the fast fourier transform(FFT) of source image by the FFT of Gaussian-kernel image and finally doing inverse fourier transform of it.
Here is the part of the code,
// Carry out the convolution in Fourier space
compleximage fftkernelimg:=realFFT(kernelimg) (-> FFT of Gaussian-kernel image)
compleximage FFTSource:=realfft(warpimg) (-> FFT of source image)
compleximage FFTProduct:=FFTSource*fftkernelimg.modulus().sqrt()
realimage invFFT:=realIFFT(FFTProduct)
The point I want to ask is this
compleximage FFTProduct:=FFTSource*fftkernelimg.modulus().sqrt()
Why does the FFT of Gaussian-kernel need '.modulus().sqrt()' for the convolution?
It is related to the fact that the fourier transform of a Gaussian function becomes another Gaussian function?
Or It is related to a sort of limitation of discrete fourier transform?
Please answer me
Thanks

This is related to the general precision limitation of any floating point numeric computing. (see f.e. here, or more in depth here)
A rotational (real-valued) Gaussian of stand.dev. sigma should be transformed into a 100% real-values rotational Gaussioan of 1/sigma. However, doing this numerically will show you deviations: Just try the following:
number sigma = 30
number A0 = 1
realimage first := RealImage( "First", 8, 256, 256 )
first = A0 * exp( - (iradius**2/(2*sigma*sigma) ))
first.showimage()
complexImage second := FFT(first)
second.Showimage()
image nonZeroImaginaryMask = ( 0 != second.Imaginary() )
nonZeroImaginaryMask.Showimage()
nonZeroImaginaryMask.SetLimits(0,1)
When you then multiply these complex images (before back-transferring) you are introducing even more errors. By using modulus, one ensures that the forward transformed kernel is purely real and hence a better "damping" curve.
A better implementation of a FFT filtering code would actually create the FFT(Gaussian) directly with a std.dev of 1/sigma, as this is the analytically correct result. Doing a FFT of the kernel only makes sense if the kernel (or its FFT) is not analytically known.
In general: When implementing any "maths" into a program code, it can pay hugely to think it through with numerical computation limits in the back of your head. Reduce actual computation whenever possible (i.e. compute analytically and use the result instead of relying on brute force numerical computation) and try to "reshape" equations when possible, f.e. avoid large sums over many small numbers, be careful about checks against exact numeric values, try to avoid expressions which are very sensitive on small numerica errors etc.

Related

Extrapolation Fourier prédiction finance

I try to do an extrapolation with the FFT method thanks to this code: https://gist.github.com/tartakynov/83f3cd8f44208a1856ce
and i would like to knw how can i select only the highest frequencies of the data serie because this code don't give me good result.
The basis vectors of an FFT are all circularly periodic, thus poorly model any edge transients or circular discontinuities between the beginning and end of a rectangular window which is the same length as the FFT (unless the full data set is purely integer periodic in FFT length, extremely unlikely for real financial data sets).
You could try windowing your data (with a non-rectangular window, such as a Von Hann, Tukey or Planck-taper window), possibly in conjunction with using a longer zero-padded FFT.
Another possibility is to use polynomial regression (perhaps somewhat underfit) on some portion of your data to generate an estimator, extend that polynomial into the region into which you wish to extrapolate, and then take the (windowed) FFT of that polynomial range instead of your original time series.

QR algorithm repeating eigenvalues

Can QR algorithm find repeat eigenvalues (https://en.wikipedia.org/wiki/QR_algorithm) ? I.e. Does it support the case when not all N eigen value for real matrix N x N are distinct?
How extend QR algorithm to support finding complex eigenvalues?
In principle yes. It will work if the eigenvalues are really all eigenvalues, i.e., the algebraic and geometric multiplicity are the same.
If the multiple eigenvalue occurs in an Jordan-block of size s, then the unavoidable floating point error during the iteration will almost surely result in a star-shaped perturbation into an eigenvalue cluster with relative error of size mu^(1/s) where mu is the machine constant of the floating point data type.
The reason this happens is that on the irreducible invariant subspace corresponding to a Jordan block of size s the characteristic polynomial of the reduction of the linear operator to this subspace has is (λ-λ[j])^s. During the computation this gets perturbed to (λ-λ[j])^s+μq(λ) which in first approximation has roots close to λ[j]+μ^(1/s)*z[k], where z[k] denotes the s roots of 0=z^s+q(λ[k]). What the perturbation function q is is quite random, accumulated floating point truncation errors, and depends on details of the method.

recovering phase of sine signal from FFT

I have a simple sine function as sin(2*pift+phi). I want to obtain the phase signal phi.
I tried to use FFT to calculate phi. In matlab I do the following
f=200; %frequency of sine wave
overSampRate=30; %oversampling rate
fs=overSampRate*f; %sampling frequency
phase = 3/5*pi; %desired phase shift in radians
nCyl = 5; %to generate five cycles of sine wave
t=0:1/fs:nCyl*1/f; %time base
x=sin(2*pi*f*t+phase); %replace with cos if a cosine wave is desired
NFFT=1024; %NFFT-point DFT
X=fft(x,NFFT); %compute DFT using FFT
XX=2*abs(X(1:NFFT/2+1));
[tt ind]=max(XX);
phase_Estimate=angle(X(ind);
This result makes almost no sense to me. For example, when phi=0.523, phase_Estimate is obtained -0.98.
Using an non-interpolated FFT result phase only works if the period of sinusoid is exactly an integer submultiple of the FFT length. In your example, the sine wave isn't integer periodic in aperture.
If not, you will need to interpolate the phase to get a better estimate. Here's one method to get an better interpolated phase:
First fftshift (rotate by N/2) the data to move the zero phase reference point to the center of the window before doing the FFT. (This is needed to keep the phase from flipping/alternating between adjacent FFT result bins. * )
Then do the FFT and estimate the frequency of the sinusoid by parabolic or, better yet, Sinc interpolation.
Then use the estimated frequency to linearly interpolate the phase between the nearest two FFT result bin phases. Update: Or better yet, use Sinc interpolation of the real and imaginary components of the FFT result separately, then use atan2 on the interpolated IQ components to get an interpolated phase.
Then use the estimated frequency and phase at the center of the window to calculate the phase at some other point, such as the beginning of the FFT window.
Also note that the phase of a sine is different from the phase of a cosine wave by pi/2. atan(im,re) returns the cosine phase.
(* as an alternative to pre-fftshit-ing the data, one could also post-flip the phase of the odd FFT result bins.)
This is actually much more difficult question to answer than it first seems.
The answer #hotpaw2 gives is completely correct and spells it out way better than any other resource I found, but it is still only an outline and it took me a few hours to put all the meat to it's bones.
In the hopes that someone else will also find the question relevant (and for future reference for myself), here is a bit more thorough explanation:
Suppose you have a (local) maximum at index ind (as in the case of question).
Step 1: try to interpolate the more precise location of the maximum by using the two surrounding values. This is well explained in many, many places such as https://www.dsprelated.com/freebooks/sasp/Quadratic_Interpolation_Spectral_Peaks.html has a good explanation for how to do that, but TL:DR version is:
delta = 0.5*(X[ind-1]-X[ind+1])/(X[ind-1]-2*X[ind]+X[ind+1])
p0 = ind+delta
with the estimated peak at p0
(If you want a more precise estimate, use log(X[ind-1]) instead, or go full out and use sinc function, but for most purposes, the delta above is sufficient)
Step 2: the tricky part: use that location to interpolate the phase.
The first instinct is to do simple linear interpolation using the delta we just found:
i0 = floor(p0); w = p0-i0; wp = 1-w
ang = wp*angle(X[i0]) + w*angle(X[i0+1])
This WILL NOT WORK for multiple reasons, most of which were outlined by #hotpaw2. The first of them is that this is not how you average angles, as they wrap modulo 2pi so 0 and 2pi should be similar. The more correct approach is to average the normalized complex numbers instead:
ang = angle(wp*X[i0]/abs(X[i0]) + w*X[i0+1]/abs(X[i0+1]))
However, this is still not correct, because if a peak is between i0 and i0+1, the phase flips 180 degrees (pi radians) there, making the average very misleading. To fix this "phase flip", you either have to (a) perform fftshift before fft (yes, in time-domain) or (b) flip the phase of every odd-indexed value of X (achieved by multiplying the complex number with -1) or (in case you are reluctant to touch the FFT as I was), you can also just (c) mock the approach (b) with the following code:
i0 = floor(p0); w = p0-i0; wp = 1-w
if (i0 % 2 == 1) { w*=-1; wp*=-1 } # Flip both if i0 odd
ang = angle(wp*X[i0]/abs(X[i0]) - w*X[i0+1]/abs(X[i0+1])) # Note the "-" here!
This will give you a (mostly) correct phase, but for a cosine and at the center of fft window.
Step 3 (Optional): If you need the phase for sine and from the beginning of the window, you need to add a correction factor:
ang_beg = ang - (2*pi*p0/N)*N/2 + pi*0.5 = ang - pi*(p0 - 0.5)
(0.5*pi converts cos to sin, and -p0*pi translates to beginning of window).
This seemed to work, at least in the Phase Vocoder I needed it in. Hopefully someone else will also find this useful.
As an aside, the phase interpolation is not needed for a pure sine wave, as angle(X[i0]) = angle(-X[i0+1]) so you can just use it directly. With actual signals, there is likely to be some deviation so interpolation adds some robustness which is usually a good idea, although using w and wp and normalizing is likely overkill and angle(sgn*(X[i0]-X[i0+1)) is usually enough.
Any comments to all this are very welcome. I am not a DSP specialist, so I may well be wrong in some details, bu this does seem to work, so hopefully someone else will also find it useful.
You're trying to get the phase from the power spectrum (XX) when you should be getting it from the FFT (X). Change:
phase_Estimate=angle(XX(ind));
to:
phase_Estimate=angle(X(ind));
It maybe late, but I changed Your script a little
f=200; %frequency of sine wave
overSampRate=30; %oversampling rate
fs=overSampRate*f; %sampling frequency
shift = 30
phase = shift*pi/180; %desired phase shift in radians
nCyl = 5; %to generate five cycles of sine wave
t=0:1/fs:nCyl*1/f; %time base
x=cos(2*pi*f*t+phase); %replace with cos if a cosine wave is desired
NFFT=4096; %NFFT-point DFT
X=fft(x,NFFT); %compute DFT using FFT
XX=2*abs(X(1:NFFT/2+1));
[tt, ind]=max(XX);
phase_Estimate = angle(X(ind)) * 360/(2*pi)
It spits out quite close results to what I would expect.
I changed the x vector generation to cosine, calculated degrees in phase_Estimate instead of radians, and made it easy to change input phase shift.

why DCT transform is preferred over other transforms in video/image compression

I went through how DCT (discrete cosine transform) is used in image and video compression standards.
But why DCT only is preferred over other transforms like dft or dst?
Because cos(0) is 1, the first (0th) coefficient of DCT-II is the mean of the values being transformed. This makes the first coefficient of each 8x8 block represent the average tone of its constituent pixels, which is obviously a good start. Subsequent coefficients add increasing levels of detail, starting with sweeping gradients and continuing into increasingly fiddly patterns, and it just so happens that the first few coefficients capture most of the signal in photographic images.
Sin(0) is 0, so the DSTs start with an offset of 0.5 or 1, and the first coefficient is a gentle mound rather than a flat plain. That is unlikely to suit ordinary images, and the result is that DSTs require more coefficients than DCTs to encode most blocks.
The DCT just happens to suit. That is really all there is to it.
When performing image compression, our best bet is to perform the KLT or the Karhunen–Loève transform as it results in the least possible mean square error between the original and the compressed image. However, KLT is dependent on the input image, which makes the compression process impractical.
DCT is the closest approximation to the KL Transform. Mostly we are interested in low frequency signals so only even component is necessary hence its computationally feasible to compute only DCT.
Also, the use of cosines rather than sine functions is critical for compression as fewer cosine functions are needed to approximate a typical signal (See Douglas Bagnall's answer for further explanation).
Another advantage of using cosines is the lack of discontinuities. In DFT, since the signal is represented periodically, when truncating representation coefficients, the signal will tend to "lose its form". In DCT, however, due to the continuous periodic structure, the signal can withstand relatively more coefficient truncation but still keep the desired shape.
The DCT of a image macroblock where the top and bottom and/or the left and right edges don't match will have less energy in the higher frequency coefficients than a DFT. Thus allowing greater opportunities for these high coefficients to be removed, more coarsely quantized or compressed, without creating more visible macroblock boundary artifacts.
DCT is preferred over DFT (Discrete Fourier Transformation) and KLT (Karhunen-Loeve Transformation)
1. Fast algorithm
2. Good energy compaction
3. Only real coefficients

Invert 4x4 matrix - Numerical most stable solution needed

I want to invert a 4x4 matrix. My numbers are stored in fixed-point format (1.15.16 to be exact).
With floating-point arithmetic I usually just build the adjoint matrix and divide by the determinant (e.g. brute force the solution). That worked for me so far, but when dealing with fixed point numbers I get an unacceptable precision loss due to all of the multiplications used.
Note: In fixed point arithmetic I always throw away some of the least significant bits of immediate results.
So - What's the most numerical stable way to invert a matrix? I don't mind much about the performance, but simply going to floating-point would be to slow on my target architecture.
Meta-answer: Is it really a general 4x4 matrix? If your matrix has a special form, then there are direct formulas for inverting that would be fast and keep your operation count down.
For example, if it's a standard homogenous coordinate transform from graphics, like:
[ux vx wx tx]
[uy vy wy ty]
[uz vz wz tz]
[ 0 0 0 1]
(assuming a composition of rotation, scale, translation matrices)
then there's an easily-derivable direct formula, which is
[ux uy uz -dot(u,t)]
[vx vy vz -dot(v,t)]
[wx wy wz -dot(w,t)]
[ 0 0 0 1 ]
(ASCII matrices stolen from the linked page.)
You probably can't beat that for loss of precision in fixed point.
If your matrix comes from some domain where you know it has more structure, then there's likely to be an easy answer.
I think the answer to this depends on the exact form of the matrix. A standard decomposition method (LU, QR, Cholesky etc.) with pivoting (an essential) is fairly good on fixed point, especially for a small 4x4 matrix. See the book 'Numerical Recipes' by Press et al. for a description of these methods.
This paper gives some useful algorithms, but is behind a paywall unfortunately. They recommend a (pivoted) Cholesky decomposition with some additional features too complicated to list here.
I'd like to second the question Jason S raised: are you certain that you need to invert your matrix? This is almost never necessary. Not only that, it is often a bad idea. If you need to solve Ax = b, it is more numerically stable to solve the system directly than to multiply b by A inverse.
Even if you have to solve Ax = b over and over for many values of b, it's still not a good idea to invert A. You can factor A (say LU factorization or Cholesky factorization) and save the factors so you're not redoing that work every time, but you'd still solve the system each time using the factorization.
You might consider doubling to 1.31 before doing your normal algorithm. It'll double the number of multiplications, but you're doing a matrix invert and anything you do is going to be pretty tied to the multiplier in your processor.
For anyone interested in finding the equations for a 4x4 invert, you can use a symbolic math package to resolve them for you. The TI-89 will do it even, although it'll take several minutes.
If you give us an idea of what the matrix invert does for you, and how it fits in with the rest of your processing we might be able to suggest alternatives.
-Adam
Let me ask a different question: do you definitely need to invert the matrix (call it M), or do you need to use the matrix inverse to solve other equations? (e.g. Mx = b for known M, b) Often there are other ways to do this w/o explicitly needing to calculate the inverse. Or if the matrix M is a function of time & it changes slowly then you could calculate the full inverse once, & there are iterative ways to update it.
If the matrix represents an affine transformation (many times this is the case with 4x4 matrices so long as you don't introduce a scaling component) the inverse is simply the transpose of the upper 3x3 rotation part with the last column negated. Obviously if you require a generalized solution then looking into Gaussian elimination is probably the easiest.