Compute Library
 19.08
CLFFT1D Class Reference

Basic function to execute one dimensional FFT.

#include <CLFFT1D.h>

Public Member Functions

 CLFFT1D (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Default Constructor.
 
void configure (const ICLTensor *input, ICLTensor *output, const FFT1DInfo &config)
 Initialise the function's source and destination tensors and the FFT configuration.
 
void run () override
 Run the kernels contained in the function.
 
Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor.
 
virtual void prepare ()
 Prepare the function for executing.
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *output, const FFT1DInfo &config)
 Static function to check if given info will lead to a valid configuration of CLFFT1D.
 

Detailed Description

Basic function to execute one dimensional FFT.

This function calls the following OpenCL kernels:

  1. CLFFTDigitReverseKernel: performs digit reversal.
  2. CLFFTRadixStageKernel: a list of FFT kernels, one per stage of the radix decomposition.
  3. CLFFTScaleKernel: performs output scaling in case of an inverse FFT.
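
The radix decomposition mentioned in item 2 can be illustrated with a small standalone sketch. It is not the library's `decompose_stages()` implementation: the function name `decompose_stages_sketch`, the greedy largest-first strategy and the radix set used in the usage note are assumptions made for illustration only.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical helper (not part of Compute Library): greedily factor n into
// supported radices, trying the largest radix first.
// Returns an empty vector when n cannot be fully decomposed.
std::vector<unsigned int> decompose_stages_sketch(unsigned int n, const std::set<unsigned int> &supported)
{
    std::vector<unsigned int> stages;
    if(n <= 1)
    {
        return stages; // nothing to decompose
    }

    unsigned int remaining = n;
    for(auto it = supported.rbegin(); it != supported.rend(); ++it)
    {
        const unsigned int radix = *it;
        assert(radix >= 2 && "radices must be at least 2");
        while(remaining % radix == 0)
        {
            stages.push_back(radix);
            remaining /= radix;
        }
    }

    if(remaining != 1)
    {
        stages.clear(); // a factor outside the supported set is left over
    }
    return stages;
}
```

With an assumed radix set of {2, 3, 4, 5, 7, 8}, a size of 12 decomposes into the stages 4 and 3, while a size of 11 yields an empty vector, which is the condition that configure() and validate() treat as an error.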

Definition at line 47 of file CLFFT1D.h.

Constructor & Destructor Documentation

◆ CLFFT1D()

CLFFT1D ( std::shared_ptr< IMemoryManager > memory_manager = nullptr )

Default Constructor.

Definition at line 33 of file CLFFT1D.cpp.

34  : _memory_group(std::move(memory_manager)), _digit_reverse_kernel(), _fft_kernels(), _scale_kernel(), _digit_reversed_input(), _digit_reverse_indices(), _num_ffts(0), _run_scale(false)
35 {
36 }

Member Function Documentation

◆ configure()

void configure ( const ICLTensor * input,
ICLTensor * output,
const FFT1DInfo & config
)

Initialise the function's source and destination tensors and the FFT configuration.

Parameters
[in]  input   Source tensor. Data types supported: F32.
[out] output  Destination tensor. Data types and data layouts supported: Same as input.
[in]  config  FFT related configuration.

Definition at line 38 of file CLFFT1D.cpp.

39 {
40     ARM_COMPUTE_ERROR_ON_NULLPTR(input, output);
41     ARM_COMPUTE_ERROR_THROW_ON(CLFFT1D::validate(input->info(), output->info(), config));
42
43     // Decompose size to radix factors
44     const auto         supported_radix   = CLFFTRadixStageKernel::supported_radix();
45     const unsigned int N                 = input->info()->tensor_shape()[config.axis];
46     const auto         decomposed_vector = arm_compute::helpers::fft::decompose_stages(N, supported_radix);
47     ARM_COMPUTE_ERROR_ON(decomposed_vector.empty());
48
49     // Flags
50     _run_scale        = config.direction == FFTDirection::Inverse;
51     const bool is_c2r = input->info()->num_channels() == 2 && output->info()->num_channels() == 1;
52
53     // Configure digit reverse
54     FFTDigitReverseKernelInfo digit_reverse_config;
55     digit_reverse_config.axis      = config.axis;
56     digit_reverse_config.conjugate = config.direction == FFTDirection::Inverse;
57     TensorInfo digit_reverse_indices_info(TensorShape(input->info()->tensor_shape()[config.axis]), 1, DataType::U32);
58     _digit_reverse_indices.allocator()->init(digit_reverse_indices_info);
59     _memory_group.manage(&_digit_reversed_input);
60     _digit_reverse_kernel.configure(input, &_digit_reversed_input, &_digit_reverse_indices, digit_reverse_config);
61
62     // Create and configure FFT kernels
63     unsigned int Nx = 1;
64     _num_ffts       = decomposed_vector.size();
65     _fft_kernels.resize(_num_ffts);
66     for(unsigned int i = 0; i < _num_ffts; ++i)
67     {
68         const unsigned int radix_for_stage = decomposed_vector.at(i);
69
70         FFTRadixStageKernelInfo fft_kernel_info;
71         fft_kernel_info.axis           = config.axis;
72         fft_kernel_info.radix          = radix_for_stage;
73         fft_kernel_info.Nx             = Nx;
74         fft_kernel_info.is_first_stage = (i == 0);
75         _fft_kernels[i].configure(&_digit_reversed_input, ((i == (_num_ffts - 1)) && !is_c2r) ? output : nullptr, fft_kernel_info);
76
77         Nx *= radix_for_stage;
78     }
79
80     // Configure scale kernel
81     if(_run_scale)
82     {
83         FFTScaleKernelInfo scale_config;
84         scale_config.scale     = static_cast<float>(N);
85         scale_config.conjugate = config.direction == FFTDirection::Inverse;
86         is_c2r ? _scale_kernel.configure(&_digit_reversed_input, output, scale_config) : _scale_kernel.configure(output, nullptr, scale_config);
87     }
88
89     // Allocate tensors
90     _digit_reversed_input.allocator()->allocate();
91     _digit_reverse_indices.allocator()->allocate();
92
93     // Init digit reverse indices
94     const auto digit_reverse_cpu = arm_compute::helpers::fft::digit_reverse_indices(N, decomposed_vector);
95     _digit_reverse_indices.map(CLScheduler::get().queue(), true);
96     std::copy_n(digit_reverse_cpu.data(), N, reinterpret_cast<unsigned int *>(_digit_reverse_indices.buffer()));
97     _digit_reverse_indices.unmap(CLScheduler::get().queue());
98 }

References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_ON, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, FFT1DInfo::axis, FFTDigitReverseKernelInfo::axis, FFTRadixStageKernelInfo::axis, ICLTensor::buffer(), CLFFTScaleKernel::configure(), CLFFTDigitReverseKernel::configure(), FFTScaleKernelInfo::conjugate, FFTDigitReverseKernelInfo::conjugate, arm_compute::helpers::fft::decompose_stages(), arm_compute::helpers::fft::digit_reverse_indices(), FFT1DInfo::direction, CLScheduler::get(), ITensor::info(), ITensorAllocator::init(), arm_compute::Inverse, FFTRadixStageKernelInfo::is_first_stage, MemoryGroupBase< TensorType >::manage(), CLTensor::map(), ITensorInfo::num_channels(), FFTRadixStageKernelInfo::Nx, FFTRadixStageKernelInfo::radix, FFTScaleKernelInfo::scale, CLFFTRadixStageKernel::supported_radix(), ITensorInfo::tensor_shape(), arm_compute::U32, CLTensor::unmap(), and CLFFT1D::validate().
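
The digit reverse indices that configure() uploads to the device can be pictured with the following standalone sketch of mixed-radix digit reversal. The name `digit_reverse_sketch` and the digit ordering (the first stage consumes the least significant digit) are assumptions; the library's `digit_reverse_indices()` may order digits differently.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of Compute Library): for every index in [0, n),
// split it into mixed-radix digits, least significant first, and rebuild the
// index with the digit order reversed.
std::vector<unsigned int> digit_reverse_sketch(unsigned int n, const std::vector<unsigned int> &stages)
{
    unsigned int product = 1;
    for(unsigned int radix : stages)
    {
        product *= radix;
    }
    assert(product == n && "stages must multiply to n");

    std::vector<unsigned int> indices(n);
    for(unsigned int i = 0; i < n; ++i)
    {
        unsigned int remaining = i;
        unsigned int reversed  = 0;
        for(unsigned int radix : stages)
        {
            // Peel off the lowest digit and append it to the reversed value.
            reversed = reversed * radix + remaining % radix;
            remaining /= radix;
        }
        indices[i] = reversed;
    }
    return indices;
}
```

For n = 8 with three radix-2 stages this reduces to the familiar bit-reversal permutation 0, 4, 2, 6, 1, 5, 3, 7.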

Referenced by CLFFT2D::configure(), and arm_compute::test::validation::DATA_TEST_CASE().
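
In the listing above, an inverse transform sets the conjugate flag on both the digit reverse stage and the scale stage, and sets the scale to N. One plausible reading, stated here as an assumption rather than as the kernels' documented behaviour, is the standard identity that an inverse DFT equals the conjugate of the forward DFT of the conjugated input, divided by N. The sketch below checks that identity with a naive O(N^2) DFT standing in for the OpenCL kernels; all names in it are illustrative.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Naive forward DFT, used only as a stand-in for the digit reverse and radix stages.
std::vector<cplx> forward_dft(const std::vector<cplx> &x)
{
    const std::size_t n  = x.size();
    const double      pi = std::acos(-1.0);
    std::vector<cplx> out(n);
    for(std::size_t k = 0; k < n; ++k)
    {
        cplx sum(0.0, 0.0);
        for(std::size_t t = 0; t < n; ++t)
        {
            const double angle = -2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += x[t] * cplx(std::cos(angle), std::sin(angle));
        }
        out[k] = sum;
    }
    return out;
}

// Inverse DFT built from the forward one: conjugate, transform, conjugate, divide by N.
std::vector<cplx> inverse_dft(const std::vector<cplx> &x)
{
    assert(!x.empty());
    std::vector<cplx> tmp(x.size());
    for(std::size_t i = 0; i < x.size(); ++i)
    {
        tmp[i] = std::conj(x[i]); // conjugate on the way in
    }
    tmp = forward_dft(tmp);
    const double n = static_cast<double>(x.size());
    for(auto &v : tmp)
    {
        v = std::conj(v) / n; // conjugate on the way out, then scale
    }
    return tmp;
}
```

Applying `inverse_dft(forward_dft(x))` returns the original samples up to rounding error, which is why a single set of forward radix kernels can serve both directions.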

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For NEON kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.

Note: CPPScheduler::set_num_threads() can be used to manually set the number of threads.

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.

Note: The function will not block until the kernels are executed. It is the user's responsibility to wait. prepare() is called on the first run if it has not been called already.

Implements IFunction.

Definition at line 125 of file CLFFT1D.cpp.

126 {
127     MemoryGroupResourceScope scope_mg(_memory_group);
128
129     // Run digit reverse
130     CLScheduler::get().enqueue(_digit_reverse_kernel, false);
131
132     // Run radix kernels
133     for(unsigned int i = 0; i < _num_ffts; ++i)
134     {
135         CLScheduler::get().enqueue(_fft_kernels[i], i == (_num_ffts - 1) && !_run_scale);
136     }
137
138     // Run output scaling
139     if(_run_scale)
140     {
141         CLScheduler::get().enqueue(_scale_kernel, true);
142     }
143 }

References CLScheduler::enqueue(), and CLScheduler::get().

Referenced by CLFFT2D::run().
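
The flush arguments in the listing above follow one rule: only the last kernel enqueued flushes the queue. The standalone sketch below reproduces that rule without any OpenCL dependency; `flush_flags_sketch` is a hypothetical helper written for illustration, not a library function.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns the flush flag passed to each enqueue call,
// in order: digit reverse, every radix stage, then the optional scale kernel.
std::vector<bool> flush_flags_sketch(unsigned int num_ffts, bool run_scale)
{
    assert(num_ffts > 0 && "at least one radix stage is expected");

    std::vector<bool> flags;
    flags.push_back(false); // digit reverse is never the last kernel
    for(unsigned int i = 0; i < num_ffts; ++i)
    {
        // The final radix stage flushes only when no scale kernel follows it.
        flags.push_back(i == (num_ffts - 1) && !run_scale);
    }
    if(run_scale)
    {
        flags.push_back(true); // the scale kernel is last, so it flushes
    }
    return flags;
}
```

In both the forward and the inverse case exactly one flag is true and it belongs to the final entry, so the queue is flushed once per run() call.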

◆ validate()

static Status validate ( const ITensorInfo * input,
const ITensorInfo * output,
const FFT1DInfo & config
)

Static function to check if given info will lead to a valid configuration of CLFFT1D.

Parameters
[in] input   Source tensor info. Data types supported: F32.
[in] output  Destination tensor info. Data types and data layouts supported: Same as input.
[in] config  FFT related configuration.

Returns
a status

Definition at line 100 of file CLFFT1D.cpp.

101 {
103     ARM_COMPUTE_RETURN_ERROR_ON(input->data_type() != DataType::F32);
104     ARM_COMPUTE_RETURN_ERROR_ON(input->num_channels() != 1 && input->num_channels() != 2);
105     ARM_COMPUTE_RETURN_ERROR_ON(std::set<unsigned int>({ 0, 1 }).count(config.axis) == 0);
106
107     // Check if FFT is decomposable
108     const auto         supported_radix   = CLFFTRadixStageKernel::supported_radix();
109     const unsigned int N                 = input->tensor_shape()[config.axis];
110     const auto         decomposed_vector = arm_compute::helpers::fft::decompose_stages(N, supported_radix);
111     ARM_COMPUTE_RETURN_ERROR_ON(decomposed_vector.empty());
112
113     // Checks performed when output is configured
114     if((output != nullptr) && (output->total_size() != 0))
115     {
116         ARM_COMPUTE_RETURN_ERROR_ON(output->num_channels() == 1 && input->num_channels() == 1);
117         ARM_COMPUTE_RETURN_ERROR_ON(output->num_channels() != 1 && output->num_channels() != 2);
120     }
121
122     return Status{};
123 }

References ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_SHAPES, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, FFT1DInfo::axis, ITensorInfo::data_type(), arm_compute::helpers::fft::decompose_stages(), arm_compute::F32, ITensorInfo::num_channels(), CLFFTRadixStageKernel::supported_radix(), ITensorInfo::tensor_shape(), and ITensorInfo::total_size().

Referenced by CLFFT1D::configure(), and CLFFT2D::validate().
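
The channel and axis checks visible in the listing above can be restated as a standalone sketch. `TensorDesc`, `validate_sketch` and `require_valid` are hypothetical names, the error strings are invented, and reading one channel as real data and two channels as complex data is an assumption. The decomposability check and the checks on the listing lines that are not shown are left out.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for ITensorInfo, reduced to the fields the checks read.
struct TensorDesc
{
    unsigned int num_channels; // assumed: 1 = real, 2 = complex
    bool         is_f32;       // true when the data type is F32
    bool         configured;   // false models an output with total_size() == 0
};

// Returns an empty string on success, or a description of the first failed check.
std::string validate_sketch(const TensorDesc &input, const TensorDesc &output, unsigned int axis)
{
    if(!input.is_f32)
    {
        return "input must be F32";
    }
    if(input.num_channels != 1 && input.num_channels != 2)
    {
        return "input must have 1 or 2 channels";
    }
    if(axis != 0 && axis != 1)
    {
        return "axis must be 0 or 1";
    }
    if(output.configured)
    {
        if(output.num_channels == 1 && input.num_channels == 1)
        {
            return "1-channel input with 1-channel output is rejected";
        }
        if(output.num_channels != 1 && output.num_channels != 2)
        {
            return "output must have 1 or 2 channels";
        }
    }
    return std::string{};
}

// Mirrors how configure() turns a failed validation into a hard error.
void require_valid(const TensorDesc &input, const TensorDesc &output, unsigned int axis)
{
    const std::string error = validate_sketch(input, output, axis);
    assert(error.empty() && "invalid FFT configuration");
    (void)error; // silence unused-variable warnings when asserts are disabled
}
```

Keeping validation as a function that returns a status, with a separate wrapper that fails hard, matches the split between validate() and configure() shown on this page: callers can probe a configuration without triggering an error.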


The documentation for this class was generated from the following files: