Academic paper

QTIB: Quick bit-reversed permutations on CPUs
Document Type
Conference
Author
Source
2011 17th International Conference on Digital Signal Processing (DSP), pp. 1-6, Jul 2011
Subject
Signal Processing and Analysis
Communication, Networking and Broadcast Technologies
Computing and Processing
Prefetching
Indexes
Registers
Bandwidth
Layout
Graphics processing unit
Language
ISSN
1546-1874
2165-3577
Abstract
We present a fast algorithm for out-of-place bit-reversed permutation of large vectors for input to an FFT. It is an extension of two previously published methods with special consideration of advanced CPU hardware features. In particular, the method makes heavy use of cache prefetching, MMX and SSE units, and write-combining buffers. Implementations have been made in assembly language for 2-byte and 4-byte operands. In terms of efficiency the method significantly outperforms previously reported methods.
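For readers unfamiliar with the operation named in the abstract, the following is a minimal sketch of a plain out-of-place bit-reversed permutation. It is not the QTIB method: it uses no prefetching, MMX/SSE instructions, or write-combining buffers, and it only illustrates the index mapping that such methods accelerate. The function names `bit_reverse` and `bit_reversed_copy` are hypothetical and do not come from the paper.

```python
def bit_reverse(index: int, bits: int) -> int:
    """Return `index` with its lowest `bits` bits in reversed order."""
    result = 0
    for _ in range(bits):
        # Shift the lowest bit of `index` onto the low end of `result`.
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def bit_reversed_copy(src):
    """Naive out-of-place bit-reversed permutation (illustrative baseline only).

    Assumes len(src) is a power of two, as is usual for radix-2 FFT input.
    """
    n = len(src)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1  # log2(n)
    dst = [None] * n
    for i, value in enumerate(src):
        # Element i of the source lands at the bit-reversed index.
        dst[bit_reverse(i, bits)] = value
    return dst
```

For an 8-element vector, `bit_reversed_copy(list(range(8)))` returns `[0, 4, 2, 6, 1, 5, 3, 7]`. The scattered writes in this naive loop have poor cache locality for large vectors, which is presumably the kind of cost that the hardware-aware techniques listed in the abstract aim to reduce.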