Compute Library
 21.02
CLFFT1D Class Reference

Basic function to execute a one-dimensional FFT.

#include <CLFFT1D.h>


Public Member Functions

 CLFFT1D (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Default Constructor.
 
 CLFFT1D (const CLFFT1D &)=delete
 Prevent instances of this class from being copied.
 
CLFFT1D & operator= (const CLFFT1D &)=delete
 Prevent instances of this class from being copied.
 
 CLFFT1D (CLFFT1D &&)=default
 Default move constructor.
 
CLFFT1D & operator= (CLFFT1D &&)=default
 Default move assignment operator.
 
 ~CLFFT1D ()
 Default destructor.
 
void configure (const ICLTensor *input, ICLTensor *output, const FFT1DInfo &config)
 Initialise the function's source and destination tensors and the FFT configuration.
 
void configure (const CLCompileContext &compile_context, const ICLTensor *input, ICLTensor *output, const FFT1DInfo &config)
 Initialise the function's source and destination tensors and the FFT configuration.
 
void run () override
 Run the kernels contained in the function.
 
Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor.
 
virtual void prepare ()
 Prepare the function for executing.
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *output, const FFT1DInfo &config)
 Static function to check if given info will lead to a valid configuration of CLFFT1D.
 

Detailed Description

Basic function to execute a one-dimensional FFT.

This function calls the following OpenCL kernels:

  1. CLFFTDigitReverseKernel Performs digit reverse.
  2. CLFFTRadixStageKernel A list of FFT kernels depending on the radix decomposition.
  3. CLFFTScaleKernel Performs output scaling in case of an inverse FFT.

Definition at line 47 of file CLFFT1D.h.
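
The three kernels listed above correspond to three phases: reorder the input, run the radix stages, and scale the result of an inverse transform. The following is a minimal CPU-side sketch of those phases, assuming a power-of-two length and radix-2 stages only. The name fft1d_sketch is hypothetical and the code is not the library's OpenCL implementation. The inverse path conjugates the input, runs the forward stages, then conjugates again and divides by N.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical helper (not a library function): radix-2 FFT on the CPU,
// split into the same three phases as the kernel list.
std::vector<cplx> fft1d_sketch(const std::vector<cplx>& input, bool inverse)
{
    const std::size_t n = input.size();
    assert(n != 0 && (n & (n - 1)) == 0); // this sketch handles powers of two only

    // Phase 1: digit reversal (bit reversal for radix 2).
    // For an inverse FFT the input is conjugated while it is reordered.
    std::size_t bits = 0;
    while ((std::size_t{1} << bits) < n) ++bits;
    std::vector<cplx> data(n);
    for (std::size_t i = 0; i < n; ++i)
    {
        std::size_t rev = 0;
        for (std::size_t b = 0; b < bits; ++b)
        {
            if (i & (std::size_t{1} << b)) rev |= std::size_t{1} << (bits - 1 - b);
        }
        data[rev] = inverse ? std::conj(input[i]) : input[i];
    }

    // Phase 2: one radix-2 butterfly stage per factor of two.
    const double pi = std::acos(-1.0);
    for (std::size_t len = 2; len <= n; len *= 2)
    {
        const cplx w_len = std::polar(1.0, -2.0 * pi / static_cast<double>(len));
        for (std::size_t start = 0; start < n; start += len)
        {
            cplx w(1.0, 0.0);
            for (std::size_t k = 0; k < len / 2; ++k)
            {
                const cplx even = data[start + k];
                const cplx odd  = w * data[start + k + len / 2];
                data[start + k]           = even + odd;
                data[start + k + len / 2] = even - odd;
                w *= w_len;
            }
        }
    }

    // Phase 3: for an inverse FFT, conjugate again and divide by N.
    if (inverse)
    {
        for (cplx& v : data) v = std::conj(v) / static_cast<double>(n);
    }
    return data;
}
```

Calling fft1d_sketch(x, false) followed by fft1d_sketch(result, true) should reproduce x up to rounding error.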

Constructor & Destructor Documentation

◆ CLFFT1D() [1/3]

CLFFT1D ( std::shared_ptr< IMemoryManager > memory_manager = nullptr )

Default Constructor.

Definition at line 36 of file CLFFT1D.cpp.

References CLFFT1D::~CLFFT1D().

    : _memory_group(std::move(memory_manager)),
      _digit_reverse_kernel(std::make_unique<CLFFTDigitReverseKernel>()),
      _fft_kernels(),
      _scale_kernel(std::make_unique<CLFFTScaleKernel>()),
      _digit_reversed_input(),
      _digit_reverse_indices(),
      _num_ffts(0),
      _run_scale(false)
{
}

◆ CLFFT1D() [2/3]

CLFFT1D ( const CLFFT1D & )
delete

Prevent instances of this class from being copied.

◆ CLFFT1D() [3/3]

CLFFT1D ( CLFFT1D &&  )
default

Default move constructor.

◆ ~CLFFT1D()

~CLFFT1D ( )
default

Default destructor.

Referenced by CLFFT1D::CLFFT1D().

Member Function Documentation

◆ configure() [1/2]

void configure ( const ICLTensor * input,
ICLTensor * output,
const FFT1DInfo & config
)

Initialise the function's source and destination tensors and the FFT configuration.

Parameters
    [in]   input    Source tensor. Data types supported: F16/F32.
    [out]  output   Destination tensor. Data types and data layouts supported: Same as input.
    [in]   config   FFT related configuration.

Definition at line 50 of file CLFFT1D.cpp.

References CLKernelLibrary::get().

Referenced by CLFFT2D::configure().

{
    configure(CLKernelLibrary::get().get_compile_context(), input, output, config);
}
◆ configure() [2/2]

void configure ( const CLCompileContext & compile_context,
const ICLTensor * input,
ICLTensor * output,
const FFT1DInfo & config
)

Initialise the function's source and destination tensors and the FFT configuration.

Parameters
    [in]   compile_context   The compile context to be used.
    [in]   input             Source tensor. Data types supported: F16/F32.
    [out]  output            Destination tensor. Data types and data layouts supported: Same as input.
    [in]   config            FFT related configuration.

Definition at line 55 of file CLFFT1D.cpp.

References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_ON, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, FFTDigitReverseKernelInfo::axis, FFT1DInfo::axis, FFTRadixStageKernelInfo::axis, ICLTensor::buffer(), FFTScaleKernelInfo::conjugate, FFTDigitReverseKernelInfo::conjugate, arm_compute::helpers::fft::decompose_stages(), arm_compute::helpers::fft::digit_reverse_indices(), FFT1DInfo::direction, CLScheduler::get(), ITensor::info(), ITensorAllocator::init(), arm_compute::Inverse, FFTRadixStageKernelInfo::is_first_stage, MemoryGroup::manage(), CLTensor::map(), N, ITensorInfo::num_channels(), FFTRadixStageKernelInfo::Nx, FFTRadixStageKernelInfo::radix, FFTScaleKernelInfo::scale, CLFFTRadixStageKernel::supported_radix(), ITensorInfo::tensor_shape(), arm_compute::U32, CLTensor::unmap(), and CLFFT1D::validate().

{
    // [source line 57 is not shown in this listing]
    ARM_COMPUTE_ERROR_THROW_ON(CLFFT1D::validate(input->info(), output->info(), config));

    // Decompose size to radix factors
    const auto         supported_radix   = CLFFTRadixStageKernel::supported_radix();
    const unsigned int N                 = input->info()->tensor_shape()[config.axis];
    const auto         decomposed_vector = arm_compute::helpers::fft::decompose_stages(N, supported_radix);
    ARM_COMPUTE_ERROR_ON(decomposed_vector.empty());

    // Flags
    _run_scale        = config.direction == FFTDirection::Inverse;
    const bool is_c2r = input->info()->num_channels() == 2 && output->info()->num_channels() == 1;

    // Configure digit reverse
    FFTDigitReverseKernelInfo digit_reverse_config;
    digit_reverse_config.axis      = config.axis;
    digit_reverse_config.conjugate = config.direction == FFTDirection::Inverse;
    TensorInfo digit_reverse_indices_info(TensorShape(input->info()->tensor_shape()[config.axis]), 1, DataType::U32);
    _digit_reverse_indices.allocator()->init(digit_reverse_indices_info);
    _memory_group.manage(&_digit_reversed_input);
    _digit_reverse_kernel->configure(compile_context, input, &_digit_reversed_input, &_digit_reverse_indices, digit_reverse_config);

    // Create and configure FFT kernels
    unsigned int Nx = 1;
    _num_ffts       = decomposed_vector.size();
    _fft_kernels.reserve(_num_ffts);
    for(unsigned int i = 0; i < _num_ffts; ++i)
    {
        const unsigned int radix_for_stage = decomposed_vector.at(i);

        FFTRadixStageKernelInfo fft_kernel_info;
        fft_kernel_info.axis           = config.axis;
        fft_kernel_info.radix          = radix_for_stage;
        fft_kernel_info.Nx             = Nx;
        fft_kernel_info.is_first_stage = (i == 0);
        _fft_kernels.emplace_back(std::make_unique<CLFFTRadixStageKernel>());
        _fft_kernels.back()->configure(compile_context, &_digit_reversed_input, ((i == (_num_ffts - 1)) && !is_c2r) ? output : nullptr, fft_kernel_info);

        Nx *= radix_for_stage;
    }

    // Configure scale kernel
    if(_run_scale)
    {
        FFTScaleKernelInfo scale_config;
        scale_config.scale     = static_cast<float>(N);
        scale_config.conjugate = config.direction == FFTDirection::Inverse;
        is_c2r ? _scale_kernel->configure(compile_context, &_digit_reversed_input, output, scale_config) : _scale_kernel->configure(output, nullptr, scale_config);
    }

    // Allocate tensors
    _digit_reversed_input.allocator()->allocate();
    _digit_reverse_indices.allocator()->allocate();

    // Init digit reverse indices
    const auto digit_reverse_cpu = arm_compute::helpers::fft::digit_reverse_indices(N, decomposed_vector);
    _digit_reverse_indices.map(CLScheduler::get().queue(), true);
    std::copy_n(digit_reverse_cpu.data(), N, reinterpret_cast<unsigned int *>(_digit_reverse_indices.buffer()));
    _digit_reverse_indices.unmap(CLScheduler::get().queue());
}
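
The host-side planning in the listing above (decompose N into radix factors, accumulate Nx per stage, build the digit-reverse index table) can be sketched in standalone C++. All names ending in _sketch, and the StagePlan struct, are hypothetical stand-ins and not the library's helpers. The greedy largest-factor-first strategy and the digit ordering are assumptions, so the library's decompose_stages() and digit_reverse_indices() may produce different factor orders or permutations.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for decompose_stages(): greedily peel off the largest
// supported factor that divides n. Returns an empty vector if n cannot be
// written as a product of supported factors (this includes n == 0 and n == 1).
std::vector<unsigned int> decompose_sketch(unsigned int n, const std::set<unsigned int>& supported)
{
    std::vector<unsigned int> stages;
    if (n == 0) return stages;
    auto it = supported.rbegin();
    while (it != supported.rend() && n > 1)
    {
        const unsigned int factor = *it;
        if (factor > 1 && n % factor == 0) { stages.push_back(factor); n /= factor; }
        else ++it;
    }
    if (n != 1) stages.clear();
    return stages;
}

// Per-stage parameters mirroring the fields set in the loop of the listing.
struct StagePlan
{
    unsigned int radix;
    unsigned int Nx;            // product of the radices of all earlier stages
    bool         is_first_stage;
    bool         writes_output; // only the last stage of a non-C2R transform
};

std::vector<StagePlan> plan_stages_sketch(const std::vector<unsigned int>& stages, bool is_c2r)
{
    std::vector<StagePlan> plan;
    unsigned int Nx = 1;
    for (std::size_t i = 0; i < stages.size(); ++i)
    {
        const bool is_last = (i + 1 == stages.size());
        plan.push_back({stages[i], Nx, i == 0, is_last && !is_c2r});
        Nx *= stages[i];
    }
    return plan;
}

// Generic mixed-radix digit reversal: read the digits of each index least
// significant first and rebuild the index with the digit order reversed.
// Returns an empty vector if the stages do not multiply to n.
std::vector<unsigned int> digit_reverse_sketch(unsigned int n, const std::vector<unsigned int>& stages)
{
    unsigned int product = 1;
    for (unsigned int r : stages) product *= r;
    if (stages.empty() || product != n) return {};
    std::vector<unsigned int> indices(n);
    for (unsigned int i = 0; i < n; ++i)
    {
        unsigned int rest = i, reversed = 0;
        for (unsigned int r : stages) { reversed = reversed * r + rest % r; rest /= r; }
        indices[i] = reversed;
    }
    return indices;
}
```

With an assumed supported set of {2, 3, 4, 5, 7, 8}, decompose_sketch(24, ...) yields the stages 8 and 3, and the second stage is planned with Nx = 8. A prime length such as 11 yields an empty vector, which is the condition the listing treats as an error.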

◆ operator=() [1/2]

CLFFT1D& operator= ( const CLFFT1D & )
delete

Prevent instances of this class from being copied.

◆ operator=() [2/2]

CLFFT1D& operator= ( CLFFT1D &&  )
default

Default move assignment operator.

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For Neon kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads.

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
prepare() is called on the first run if it has not been called already.

Implements IFunction.

Definition at line 143 of file CLFFT1D.cpp.

References CLScheduler::enqueue(), and CLScheduler::get().

Referenced by CLFFT2D::run().

{
    MemoryGroupResourceScope scope_mg(_memory_group);

    // Run digit reverse
    CLScheduler::get().enqueue(*_digit_reverse_kernel, false);

    // Run radix kernels
    for(unsigned int i = 0; i < _num_ffts; ++i)
    {
        CLScheduler::get().enqueue(*_fft_kernels[i], i == (_num_ffts - 1) && !_run_scale);
    }

    // Run output scaling
    if(_run_scale)
    {
        CLScheduler::get().enqueue(*_scale_kernel, true);
    }
}
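
In the listing above, only the final enqueue call requests a flush: the last radix stage for a forward FFT, or the scale kernel for an inverse FFT. A minimal standalone sketch of that decision follows. The function name flush_flags_sketch is hypothetical, and it only models the boolean passed as the second argument of enqueue(), not the scheduler itself.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns, in enqueue order, the flush flag that the
// listing passes for each kernel (digit reverse, radix stages, optional scale).
std::vector<bool> flush_flags_sketch(unsigned int num_ffts, bool run_scale)
{
    std::vector<bool> flags;

    // Digit reverse is never flushed on its own.
    flags.push_back(false);

    // Radix stages: flush only after the last stage, and only if no scale kernel follows.
    for (unsigned int i = 0; i < num_ffts; ++i)
    {
        flags.push_back(i == (num_ffts - 1) && !run_scale);
    }

    // Inverse FFT: the scale kernel is the last one enqueued and carries the flush.
    if (run_scale)
    {
        flags.push_back(true);
    }

    // One flag per enqueued kernel.
    assert(flags.size() == 1u + num_ffts + (run_scale ? 1u : 0u));
    return flags;
}
```

For three radix stages, a forward FFT gives the flags false, false, false, true, and an inverse FFT gives false, false, false, false, true. In both cases exactly one flush is requested, at the end of the sequence.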

◆ validate()

Status validate ( const ITensorInfo * input,
const ITensorInfo * output,
const FFT1DInfo & config
)
static

Static function to check if given info will lead to a valid configuration of CLFFT1D.

Parameters
    [in]  input    Source tensor info. Data types supported: F16/F32.
    [in]  output   Destination tensor info. Data types and data layouts supported: Same as input.
    [in]  config   FFT related configuration.
Returns
    A status.

Definition at line 118 of file CLFFT1D.cpp.

References ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_SHAPES, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, FFT1DInfo::axis, arm_compute::helpers::fft::decompose_stages(), arm_compute::F16, arm_compute::F32, N, ITensorInfo::num_channels(), CLFFTRadixStageKernel::supported_radix(), ITensorInfo::tensor_shape(), and ITensorInfo::total_size().

Referenced by CLFFT1D::configure(), arm_compute::test::validation::DATA_TEST_CASE(), and CLFFT2D::validate().

{
    // [source lines 120-121 are not shown in this listing]
    ARM_COMPUTE_RETURN_ERROR_ON(input->num_channels() != 1 && input->num_channels() != 2);
    ARM_COMPUTE_RETURN_ERROR_ON(std::set<unsigned int>({ 0, 1 }).count(config.axis) == 0);

    // Check if FFT is decomposable
    const auto         supported_radix   = CLFFTRadixStageKernel::supported_radix();
    const unsigned int N                 = input->tensor_shape()[config.axis];
    const auto         decomposed_vector = arm_compute::helpers::fft::decompose_stages(N, supported_radix);
    ARM_COMPUTE_RETURN_ERROR_ON(decomposed_vector.empty());

    // Checks performed when output is configured
    if((output != nullptr) && (output->total_size() != 0))
    {
        ARM_COMPUTE_RETURN_ERROR_ON(output->num_channels() == 1 && input->num_channels() == 1);
        ARM_COMPUTE_RETURN_ERROR_ON(output->num_channels() != 1 && output->num_channels() != 2);
        // [source lines 136-137 are not shown in this listing]
    }

    return Status{};
}
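
The channel and axis checks visible in the listing above accept real-to-complex, complex-to-complex and complex-to-real transforms along axis 0 or 1, and reject real-to-real. A minimal standalone sketch of just those rules follows. InfoSketch and channels_and_axis_ok_sketch are hypothetical names, and the sketch deliberately leaves out the data type, shape and decomposability checks.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical summary of the tensor metadata the visible checks rely on.
struct InfoSketch
{
    unsigned int num_channels; // 1 = real, 2 = complex (interleaved real/imaginary)
    std::size_t  total_size;   // 0 means the tensor has not been configured yet
};

// Mirrors only the channel and axis conditions shown in the listing.
bool channels_and_axis_ok_sketch(const InfoSketch& input, const InfoSketch* output, unsigned int axis)
{
    // Input must be real (1 channel) or complex (2 channels).
    if (input.num_channels != 1 && input.num_channels != 2) return false;

    // Only axis 0 and axis 1 are accepted.
    if (axis != 0 && axis != 1) return false;

    // Output checks apply only when an output has been configured.
    if (output != nullptr && output->total_size != 0)
    {
        // Real input with real output is rejected.
        if (output->num_channels == 1 && input.num_channels == 1) return false;
        // Output must be real or complex.
        if (output->num_channels != 1 && output->num_channels != 2) return false;
    }
    return true;
}
```

Passing a null output, or one whose total_size is 0, skips the output checks, which matches the "Checks performed when output is configured" branch in the listing.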

The documentation for this class was generated from the following files: