A Taxonomy of Solutions for Fixed-Radix FFT Algorithm
Abstract
This paper, which is of a tutorial nature, is concerned with the computation of the generic radix-R version of the fixed-radix fast Fourier transform (FFT) algorithm, where R is taken to be an arbitrary positive integer. Much of the existing technical literature on the fixed-radix FFT deals with those simple cases where the radix takes on a value of either two or four and where certain restrictions are assumed on the placement of the index mappings. To address this situation, four algorithmic variations are discussed here for dealing with the radix-R FFT, these arising from the adoption of different combinations of decimation scheme, as provided by the decimation-in-time (DIT) and decimation-in-frequency (DIF) techniques (for breaking problem down into a number of stages with each comprising multiple smaller sub problems), and data reordering scheme, as provided by the natural ordering (NAT) and digit reversal (DR) index mappings. The computational unit, denoted CUR, as required for the efficient computation of the repetitive arithmetic operations required by the radix-R transform, is described here in some detail for all four variations, noting that for the most commonly adopted radices, namely those with values of two or four, the unit is more commonly known as the ‘butterfly’ or ‘dragonfly’, respectively. A good understanding of the operation of the CUR requires an appreciation of three key concepts, namely: 1) the R-fold input/ output data set, relating to the independent output data sets produced from distinct input data sets by the CUR (which, for familiar radix-2 FFT, is more commonly known as a ‘dual node’); 2) the stride, relating to the stage-dependent address spacing of the R-fold interleaved (and thus independent) input data sets to the CUR; and 3) the number and size of distinct sets of independent CUR’s, per stage, each set requiring its own distinct set of R twiddle factors. These concepts are illustrated through the provision of detailed examples for each of the four algorithmic variations where a radix-2 transform of length four is assumed for ease of illustration. Finally, a brief account is given of those techniques – as defined over both ‘spatial’ and ‘temporal’ domains – that might be exploited in order to facilitate the efficient parallel computation of the radix-R FFT when suitably defined parallel equipment is available for its implementation.

