Overview

The AIE API offers theaie::fft_ditclass template that provides the building blocks to implement all the stages of the FFT decimation-in-time computation. This class template is parametrized with the data type, the radix to be used and the vectorization of the FFT stage to be computed. The resulting class defines a stage iterator that contains the state of the computation (pointers to the input and twiddle arrays), and a function that performs the decimation in time.

By default the input and output type are the same, but you can optionally specify a second type for a different output type.

The following snippet shows the sample implementation of a radix-2 FFT stage for any vectorization value.

      template<
      unsignedVectorization>
     
      voidradix2_dit(
      constcint32 * __restrict x,
     
      constcint16 * __restrict tw,
     
      unsignedn,
      unsignedshift_tw,
      unsigned
      shift,
      bool
      inv,
     
      cint32 * __restrict y)
     
      {
     
      usingFFT =
      aie::fft_dit;
      // radix = 2, input_type = cint32, output_type = cint32 (output defaults to input)
     
      FFT fft;
     
      autoit_stage = fft.begin_stage(x, tw);
     
      autoit_out0 = aie::begin_vector(y);
     
      autoit_out1 = aie::begin_vector(y + n / 2);
     
      for(
      intj = 0; j < n / (2 * FFT::out_vector_size); ++j)
     
      chess_prepare_for_pipelining
     
      chess_loop_range(1,)
     
      {
     
      const
      autoout = fft.dit(*it_stage++, shift_tw,
      shift,
      inv);
     
      *it_out0++ = out[0];
     
      *it_out1++ = out[1];
     
      }
     
      }

Typedefs
template
using	aie::fft_dit=detail::fft_dit< Vectorization, detail::fft_get_stage< Vectorization, Radix, T1, T2 >(), Radix, T1, T2 >
	More...

Supported Fast Fourier Transform Modes

Supported FFT/IFFT modes
Data type	Radix
c16b	2
c32b	2
c32b	4
cfloat	2

For a given radix R with n number of points, there will be R number of outputs spaced by n/R. The above example shows 2 outputs for a radix 2, for radix 4 we would have:

       template<
       unsignedVectorization>
      
       voidradix4_dit(
       constcint32 * __restrict x,
      
       constcint16 * __restrict tw1,
      
       constcint16 * __restrict tw2,
      
       unsignedn,
       unsignedshift_tw,
       unsigned
       shift,
       bool
       inv,
      
       cint32 * __restrict y)
      
       {
      
       usingFFT =
       aie::fft_dit;
      
       FFT fft;
      
       autoit_stage = fft.begin_stage(x, tw1, tw2);
      
       autoit_out0 = aie::begin_restrict_vector(y);
      
       autoit_out1 = aie::begin_restrict_vector(y + n / 4);
      
       autoit_out2 = aie::begin_restrict_vector(y + 2*n / 4);
      
       autoit_out3 = aie::begin_restrict_vector(y + 3*n / 4);
      
       chess_report(FFT::out_vector_size);
      
       for(
       intj = 0; j < n / (4 * FFT::out_vector_size); ++j)
      
       chess_prepare_for_pipelining
      
       chess_loop_range(1,)
      
       {
      
       const
       autoout = fft.dit(*it_stage++, shift_tw,
       shift,
       inv);
      
       *it_out0++ = out[0];
      
       *it_out1++ = out[1];
      
       *it_out2++ = out[2];
      
       *it_out3++ = out[3];
      
       }
      
       }

However you can also use less with different incrementation schemes to achieve the same result. This approach is currently recommended for optimal performance on c32b radix 4 for Vectorization > 1 due to internal pointer requirements:

       template<
       unsignedVectorization>
      
       voidradix4_dit(
       constcint32 * __restrict x,
      
       constcint16 * __restrict tw1,
      
       constcint16 * __restrict tw2,
      
       unsignedn,
       unsignedshift_tw,
       unsigned
       shift,
       bool
       inv,
      
       cint32 * __restrict y)
      
       {
      
       usingFFT =
       aie::fft_dit;
      
       FFT fft;
      
       autoit_stage = fft.begin_stage(x, tw1, tw2);
      
       autoit_out0 = aie::begin_restrict_vector(y);
      
       autoit_out1 = aie::begin_restrict_vector(y + n / 2);
      
       for(
       intj = 0; j < n / (4 * FFT::out_vector_size); ++j)
      
       chess_prepare_for_pipelining
      
       chess_loop_range(1,)
      
       {
      
       const
       autoout = fft.dit(*it_stage++, shift_tw,
       shift,
       inv);
      
       *it_out0 = out[0]; it_out0 += n/16;
      
       *it_out0 = out[1]; it_out0 += -(int)(n/16)+1;
      
       *it_out1 = out[2]; it_out1 += n/16;
      
       *it_out1 = out[3]; it_out1 += -(int)(n/16)+1;
      
       }
      
       }

Going to/from cint16/cint32 is also supported. Here is an example of a radix 2 downscaling from cint32 to cint16:

       template<
       unsignedVectorization>
      
       voidradix2_down_dit(
       const
       cint32_t* __restrict x,
      
       const
       cint16_t* __restrict tw,
      
       intn,
       intshift_tw,
       int
       shift,
       bool
       inv,
      
       cint16_t* __restrict y)
      
       {
      
       usingFFT =
       aie::fft_dit;
      
       FFT fft;
      
       autoit_stage = fft.begin_stage(x, tw);
      
       autoit_out0 = aie::begin_vector(y);
      
       autoit_out1 = aie::begin_vector(y + n / 2);
      
       for(
       intj = 0; j < n / (2 * FFT::out_vector_size); ++j)
      
       chess_prepare_for_pipelining
      
       chess_loop_range(1,)
      
       {
      
       const
       autoout = fft.dit(*it_stage++, shift_tw,
       shift,
       inv);
      
       *it_out0++ = out[0];
      
       *it_out1++ = out[1];
      
       }
      
       }

Typedef Documentation

◆fft_dit

template

usingaie::fft_dit= typedefdetail::fft_dit(), Radix, T1, T2>

Type that encapsulates the functionality for decimation-in-time FFTs.

Template Parameters

Vectorization	Vectorization of the FFT stage
Radix	Number which selects the FFT radix.
T1	Type of the input elements.
T2	Type of the output elements, defaults to input type.

Overview

Typedefs

Supported Fast Fourier Transform Modes

Typedef Documentation

◆fft_dit