GradientMagnitudeImageFilter in ITK

What does GradientMagnitudeImageFilter do in ITK (i.e. what is the formulation)?
Is it just a Sobel filter? Is the skimage equivalent filters.sobel? The ITK documentation is somewhat sparse on how it works.

It is not just a Sobel filter (which also exists in ITK as a separate filter). Unlike Sobel, it takes the image's physical pixel spacing into account. ITK has many more edge-related filters.
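For reference, the formulation (as I read the ITK documentation and source, so verify against your version) is the Euclidean norm of the image gradient at each pixel:

|grad I| = sqrt( (dI/dx_1)^2 + (dI/dx_2)^2 + ... + (dI/dx_d)^2 )

Each partial derivative is estimated with a small finite-difference derivative kernel along its axis and, when UseImageSpacing is on (the default), divided by the physical pixel spacing along that axis. skimage's filters.sobel also returns a gradient magnitude, but it uses the Sobel kernels and knows nothing about spacing, so the two agree only approximately, and only for unit isotropic spacing.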

Related

What is the best way to implement STFT (Short-time Fourier transform) in Julia

So, I'm wondering how to implement an STFT in Julia, possibly using a Hamming window. I can't find anything on the internet.
What is the best way to do it? I'd prefer not to use Python libraries, but pure native Julia if possible. Maybe it's a feature still being developed in Julia...?
Thanks!
I'm not aware of any native STFT implementation in Julia. As David stated in his comment, you will have to implement this yourself. This is fairly straightforward:
1) Break up your signal into short time sections
2) Apply your window of choice (in your case, Hamming)
3) Take the FFT using Julia's fft function.
All of the above are fairly standard mathematical operations and you will find a lot of references online. The below code generates a Hamming window if you don't have one already (watch out for Julia's 1-indexing when using online references as a lot of the signal processing reference material likes to use 0-indexing when describing window generation).
## Hamming window generation (periodic variant, window length N)
Wb = Array{Float64}(undef, N)   # Array(Float64, N) in the old Julia this was written for
for n in 1:N
    Wb[n] = 0.54 - 0.46*cos(2pi*(n-1)/N)
end
You could also use the Hamming window function from DSP.jl.
P.S. If you are running Julia on OS X, check out the Julia interface to Apple's Accelerate framework. This provides a very fast Hamming window implementation, as well as convolution and elementwise multiplication functions that might be helpful.
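Putting the three steps together, here is a minimal sketch of the whole STFT (my own illustration, not a library routine; it assumes length(x) >= N and a window w of length N, and note that in current Julia fft lives in the FFTW.jl package rather than Base):

using FFTW   # in the old Julia this answer targeted, fft was in Base

# STFT sketch: frames of length N advanced by `hop` samples,
# each windowed and transformed; column k is the spectrum of frame k.
function my_stft(x::Vector{Float64}, N::Int, hop::Int, w::Vector{Float64})
    nframes = 1 + div(length(x) - N, hop)
    S = Array{ComplexF64}(undef, N, nframes)
    for k in 1:nframes
        lo = (k - 1)*hop + 1
        S[:, k] = fft(x[lo:lo+N-1] .* w)   # window the frame, then FFT it
    end
    return S
end

With the Wb window generated above, my_stft(x, N, div(N, 2), Wb) gives a 50%-overlap STFT.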

Using FFTW compatibility mode in cuFFT

I have a full project created using FFTW. I want to transition to using cuFFT. I understand that cuFFT has a "compatibility mode". But how exactly does this work? The cuFFT manual says:
After an application is working using the FFTW3 interface, users may want to modify their code to move data to and from the GPU and use the routines documented in the FFTW Conversion Guide for the best performance.
Does this mean I actually need to change my individual function calls? For example, call
cufftPlan1d() instead of fftw_plan_dft_1d().
Do I also have to change my data types?
fftw_complex *inputData;   // fftw data storage gets replaced...
cufftComplex *inputData;   // ... by cufft data storage?
fftw_plan forwardFFT;      // fftw plan gets replaced...
cufftHandle forwardFFT;    // ... by cufft plan?
If I'm going to have to rewrite all of my code, what is the point of cufftSetCompatibilityMode()?
Probably what you want is the cuFFTW interface to cuFFT. I suggest you read this documentation, as it is probably close to what you have in mind. This will allow you to use cuFFT in an FFTW application with a minimum amount of changes. As indicated in the documentation, there should only be two steps required:
It is recommended that you replace the include file fftw3.h with cufftw.h
Instead of linking with the double/single precision FFTW libraries (fftw3/fftw3f), link with both the cuFFT and cuFFTW libraries
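Concretely, the swap looks something like this (a sketch; the exact link line depends on your toolchain):

// before: FFTW proper
// #include <fftw3.h>
// after: the cuFFTW drop-in header
#include <cufftw.h>

// and on the link line, replace -lfftw3 (or -lfftw3f)
// with -lcufftw -lcufft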
Regarding the doc item you excerpted, that step (moving the data explicitly) is not required if you're just using the cuFFTW compatibility interface. However, you may not achieve maximum performance this way. If you want to achieve maximum performance, you may need to use cuFFT natively, for example so that you can explicitly manage data movement. Whether or not this is important will depend on the specific structure of your application (how many FFTs you are doing, and whether any data is shared amongst multiple FFTs, for example). If you intend to use cuFFT natively, then the following comments apply:
Yes, you need to change your individual function calls. They must line up with function names in the API, associated header files, and library. The fftw_ function names are not in the cuFFT library.
You can inspect your data types and should discover that for the basic data types like float, double, complex, etc. they should be layout-compatible between cuFFT and FFTW. Personally I would recommend changing your data types to cuFFT data types, but there should be no functional or performance difference at this time.
Although you don't mention it, cuFFT will also require you to move the data between CPU/Host and GPU, a concept that is not relevant for FFTW.
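To make those three points concrete, here is a minimal native-cuFFT sketch of a single-precision 1-D forward transform (my own illustration of the API shape, not code from the cuFFT documentation; error checking omitted):

#include <cuda_runtime.h>
#include <cufft.h>

#define N 1024

void forward_fft_1d(cufftComplex *h_data)  /* N complex samples on the host */
{
    cufftComplex *d_data;                  /* device-side storage           */
    cudaMalloc((void **)&d_data, N * sizeof(cufftComplex));
    cudaMemcpy(d_data, h_data, N * sizeof(cufftComplex),
               cudaMemcpyHostToDevice);    /* explicit host-to-device move  */

    cufftHandle plan;                      /* replaces fftw_plan            */
    cufftPlan1d(&plan, N, CUFFT_C2C, 1);   /* replaces fftw_plan_dft_1d     */
    cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD);  /* in-place forward */

    cudaMemcpy(h_data, d_data, N * sizeof(cufftComplex),
               cudaMemcpyDeviceToHost);    /* bring the result back         */
    cufftDestroy(plan);
    cudaFree(d_data);
}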
Regarding cufftSetCompatibilityMode, the function documentation and discussion of FFTW compatibility mode is pretty clear on its purpose. It has to do with overall data layout, especially padding of data for FFTW.
Check out this link: https://developer.nvidia.com/blog/cudacasts-episode-8-accelerate-fftw-apps-cufft-55/
It says that all we need to do is change the linking.

1-dimensional cubic spline interpolation in CUDA

I'm building medical imaging equipment and want to use CUDA to make it faster.
I receive 1024-sample 1-D data from a CCD, 512 times. Before I perform the IFFT, I have to apply a high-performance interpolation algorithm (such as cubic spline interpolation) to each 1024-sample array (so 1-D interpolation, 512 times).
Is there any CUDA library to perform cubic spline interpolation?
(I found that there is one library, but it is for 2- or 3-dimensional images.
Since I need to perform other complicated filtering functions, I need the data in global memory, not in texture memory.)
Is there any NUFFT (non uniform fast Fourier transform) library (doesn't need to be written for CUDA)?
I'm thinking that if I have a NUFFT function, I don't have to do the interpolation and the IFFT separately, which could make the equipment even faster.
Since more people have asked this, I have extended my CUDA cubic interpolation code with 1D cubic interpolation as well. You can find the updated code here: http://www.dannyruijters.nl/cubicinterpolation/
A working CUDA example that also contains 1D cubic interpolation can be found in the cudaAccuracyTest sample in the examples subdirectory in CI.zip.
For those of you who are more interested in a SSE approach, I have some working SSE optimized multi-threaded cubic interpolation code (albeit in 3D, not 1D) in the referenceCubicTexture3D sample in the examples subdirectory.
Edit: The cubic interpolation code is now available on GitHub. The 1D cubic interpolation code is here.
Regarding #1
Ruijters' bi/tricubic spline interpolation, which I think is what you are referring to (http://dannyruijters.nl/cubicinterpolation), now (edited!) works with 1D data, thank you! See Danny Ruijters' answer on this page.
Regarding #2
Here are a few NUFFT implementations that I'm aware of, with brief thoughts on them.
The first library, mentioned by @ardiyu07, is Greengard et al.'s implementation of fast Gaussian gridding. It is in Fortran, which I don't know, so I didn't look at it for long (though it does offer type-III nonuniform-to-nonuniform transforms).
The second one is Ferrara's implementation of Greengard's algorithm in Matlab/MEX, and I couldn't get it to give me the correct solution (see my comment to that effect on Mathworks FileExchange, which I just posted).
Potts et al., http://www-user.tu-chemnitz.de/~potts/nfft/ : I couldn't get this to compile on Windows, so I gave up on it. It also has type-III NUFFTs.
Fessler et al., http://web.eecs.umich.edu/~fessler/code/ : written in Matlab/MEX, with pre-compiled binaries provided for at least Linux and Windows. Definitely written by non-professional programmers, but it's the only one of the four that I've gotten to work correctly. I even got it to work in GNU Octave after changing their Matlab source code in a handful of places (basically by seeing where Octave errors were raised), since Octave can use pre-compiled MEX binaries. It also uses a different algorithm than Greengard's or Potts', based on min-max criteria (its solutions are guaranteed to minimize the maximum DFT error), but it lacks type-III NUFFTs (only types I and II: one of the domains has to be uniform).
I believe a fifth NUFFT/"gridding" implementation is by Hargreaves, et al.: http://www-mrsrl.stanford.edu/~brian/gridding/ (paper at http://dx.doi.org/10.1109/TMI.2005.848376). It is in Matlab/MEX. As is, it is not as general-purpose as the previous four on this list, as it's very much embedded in its MRI context.
And here's a sixth implementation, in Cython (fast Python), with type-III nonuniform-to-nonuniform transforms and some other nice features, alas under GPL: https://github.com/mrbell/gfft
I'm working, at a glacial pace, on porting Fessler's algorithm to Python/Cython, and maybe CUDA ("maybe" because just zero-padding the standard (CU)FFT and linear interpolation seems to work well enough). Best of luck.
I don't know about that algorithm, but if you think what you've found is fast enough for your equipment, then why don't you change the implementation from using texture memory to just a simple array (see the sketch after the links below), and maybe you can get more speedup using shared memory?
I've found some written in Matlab and Fortran 77:
http://www.cims.nyu.edu/cmcl/nufft/nufft.html
http://www.mathworks.com/matlabcentral/fileexchange/25135-nufft-nufft-usffft
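Returning to the texture-vs-array point above, here is a sketch of 1-D cubic convolution (Catmull-Rom) interpolation reading straight from global memory. Note this is plain cubic convolution, not Ruijters' prefiltered B-spline scheme, and the kernel and names are my own illustration:

/* out[i] = f interpolated at position xs[i]; f has n samples.
   Launch e.g. as cubic_interp_1d<<<(m + 255)/256, 256>>>(...). */
__global__ void cubic_interp_1d(const float *f, int n,
                                const float *xs, float *out, int m)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= m) return;
    float x  = xs[i];
    int   i1 = (int)floorf(x);
    float t  = x - i1;                 /* fractional position in [0,1) */
    /* clamp the 4-tap neighborhood to the array bounds */
    int i0 = max(i1 - 1, 0);
    int i2 = min(i1 + 1, n - 1);
    int i3 = min(i1 + 2, n - 1);
    i1 = min(max(i1, 0), n - 1);
    /* Catmull-Rom weights (cubic convolution with a = -1/2) */
    float w0 = 0.5f * t * ((2.0f - t) * t - 1.0f);
    float w1 = 0.5f * (t * t * (3.0f * t - 5.0f) + 2.0f);
    float w2 = 0.5f * t * ((4.0f - 3.0f * t) * t + 1.0f);
    float w3 = 0.5f * t * t * (t - 1.0f);
    out[i] = w0 * f[i0] + w1 * f[i1] + w2 * f[i2] + w3 * f[i3];
}

For the 512 x 1024 workload above you could also fuse the loop over the 512 lines into the same launch, since each output sample is independent.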
To be honest, your parallelism seems to be a bit low for the GPU. A 6-core CPU with SSE optimizations might outperform a GPU here.

Cosine in floating point

I am trying to implement the cosine and sine functions in floating point (but I have no floating point hardware).
Since my processor has no floating-point hardware, nor instructions, I have already implemented algorithms for floating point multiplication, division, addition, subtraction, and square root. So those are the tools I have available to me to implement cosine and sine.
I was considering using the CORDIC method, at this site
However, I implemented division and square root using Newton's method, so I was hoping to use the most efficient method here as well.
Please don't tell me to just go look in a book or that "papers exist"; no kidding they exist. I am looking for names of well-known algorithms that are known to be fast and efficient.
First off, depending on your accuracy requirements, this can be considerably fussier than your earlier questions.
Now that you've been warned: you'll first want to reduce the argument modulo pi/2 (or 2pi, or pi, or pi/4) to get the input into a manageable range. This is the subtle part. For a nice discussion of the issues involved, download a copy of K.C. Ng's "Argument Reduction for Huge Arguments: Good to the Last Bit" (a simple Google search on the title will get you a PDF). It's very readable, and does a great job of describing why this is tricky.
After doing that, you only need to approximate the functions on a small range around zero, which is easily done via a polynomial approximation. A Taylor series will work, though it is inefficient. A truncated Chebyshev series is easy to compute and reasonably efficient; computing the minimax approximation is better still. This is the easy part.
I have implemented sine and cosine exactly as described, entirely in integer, in the past (sorry, no public sources). Using hand-tuned assembly, results in the neighborhood of 100 cycles are entirely reasonable on "typical" processors. I don't know what hardware you're dealing with (the performance will mostly be gated on how quickly your hardware can produce the high part of an integer multiply).
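As a rough illustration of the two stages above (my own sketch, not the answer's implementation: a naive reduction that is only safe for modest |x|, plus truncated Taylor polynomials good to roughly 1e-7 on the reduced range; see Ng's paper for reducing huge arguments correctly, and use a minimax fit in production):

/* cos(y) on |y| <= pi/4, Taylor terms through y^8 */
static double poly_cos(double y)
{
    double y2 = y * y;
    return 1.0 + y2 * (-1.0/2 + y2 * (1.0/24 + y2 * (-1.0/720 + y2 * (1.0/40320))));
}

/* sin(y) on |y| <= pi/4, Taylor terms through y^7 */
static double poly_sin(double y)
{
    double y2 = y * y;
    return y * (1.0 + y2 * (-1.0/6 + y2 * (1.0/120 + y2 * (-1.0/5040))));
}

double my_cos(double x)
{
    const double half_pi = 1.57079632679489661923;
    /* nearest multiple of pi/2: naive, loses accuracy for large |x| */
    int k = (int)(x / half_pi + (x >= 0 ? 0.5 : -0.5));
    double y = x - k * half_pi;
    switch (((k % 4) + 4) % 4) {   /* quadrant selection */
        case 0:  return  poly_cos(y);
        case 1:  return -poly_sin(y);
        case 2:  return -poly_cos(y);
        default: return  poly_sin(y);
    }
}

On your hardware each double operation above would itself be one of your software floating-point routines, so the polynomial degree directly controls the cycle count.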
For various levels of precision, you can find some good approximations here:
http://www.ganssle.com/approx.htm
These have the added advantage of being deterministic in runtime, unlike the various "converging series" options, whose cost can vary wildly depending on the input value. This matters if you are doing anything real-time (games, motion control, etc.).
Since you have the basic arithmetic operations implemented, you may as well implement sine and cosine using their Taylor series expansions.

Using ARPACK with PARDISO

This question is similar to one asked here.
I wonder: is there already an open-source implementation or example of an ARPACK eigensolver that works well with the PARDISO solver and the Intel MKL library?
I've implemented it already, in C#.
The idea is that one must convert the matrix into CSR format. Then one can use MKL for the linear equation solving (using the PARDISO solver) and for the matrix-vector manipulation.
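For illustration, the three-array CSR layout that MKL's PARDISO consumes looks like this (a sketch in C rather than the answer's C#; MKL PARDISO defaults to one-based indices, controllable through its iparm settings):

/* CSR ("3-array variant") of the 3x3 matrix
 *     | 4 0 1 |
 *     | 0 3 0 |
 *     | 2 0 5 |
 * with one-based indexing. */
double values[]   = {4.0, 1.0, 3.0, 2.0, 5.0}; /* nonzeros, row by row        */
int    columns[]  = {1, 3, 2, 1, 3};           /* column index of each value  */
int    rowIndex[] = {1, 3, 4, 6};              /* start of each row in values;
                                                  the last entry is nnz + 1   */

In ARPACK's reverse-communication loop, PARDISO then supplies the repeated sparse solves (e.g. in shift-invert mode) while MKL's sparse BLAS handles the matrix-vector products.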