Fast Computation of General Fourier Transforms on GPUs
2008 IEEE International Conference on Multimedia and Expo, Vols 1-4 (2008)
Abstract
We present an implementation of general FFTs for graphics processing units (GPUs). Unlike most existing GPU FFT implementations, we handle both complex and real data of any size that can fit in a texture. The basic building block for our algorithms is a radix-2 Stockham formulation of the FFT for power-of-two data sizes, which avoids expensive bit reversals and makes efficient use of the high GPU memory bandwidth. We implemented our algorithms using the DirectX 9 API, so our routines can be used on many existing GPUs. We compared our implementation against optimized CPU-based and GPU-based FFT libraries (Intel Math Kernel Library and NVIDIA CUFFT, respectively). Our results on an NVIDIA GeForce 8800 GTX GPU indicate a significant performance improvement over the existing libraries for many input cases.
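To illustrate the radix-2 Stockham formulation mentioned in the abstract, here is a minimal CPU-side sketch in Python. It is not the authors' DirectX 9 implementation. The function name `stockham_fft` and the two-buffer layout are assumptions made for illustration. The point it shows is that each pass reads from one buffer and writes to the other at autosorted positions, so the output comes out in natural order with no bit-reversal step. On a GPU, the two buffers would plausibly correspond to a pair of textures swapped between passes, though the abstract does not spell this out.

```python
import cmath


def stockham_fft(data):
    """Radix-2 Stockham autosort FFT for power-of-two lengths (illustrative sketch)."""
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")

    x = [complex(v) for v in data]  # buffer read in the current pass
    y = [0j] * n                    # buffer written in the current pass

    length = n   # size of each sub-transform in the current pass
    stride = 1   # number of interleaved sub-transforms
    while length > 1:
        half = length // 2
        for p in range(half):
            # Twiddle factor for butterfly p of a transform of size `length`.
            w = cmath.exp(-2j * cmath.pi * p / length)
            for q in range(stride):
                a = x[q + stride * p]
                b = x[q + stride * (p + half)]
                # Outputs land at autosorted positions; no bit reversal needed.
                y[q + stride * (2 * p)] = a + b
                y[q + stride * (2 * p + 1)] = (a - b) * w
        x, y = y, x  # ping-pong: this pass's output is the next pass's input
        length = half
        stride *= 2
    return x
```

Calling `stockham_fft([1, 0, 0, 0])` returns four values equal to 1, the transform of a unit impulse. The sketch runs log2(n) passes, each touching every element once, which is the access pattern that suits a bandwidth-rich device.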
Keywords
graphics hardware, FFT, GPGPU