Basic function to execute FFT-based convolution on OpenCL.

#include <CLFFTConvolutionLayer.h>

Public Member Functions
	CLFFTConvolutionLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
	Default constructor.

	CLFFTConvolutionLayer (const CLFFTConvolutionLayer &)=delete
	Prevent instances of this class from being copied (as this class contains pointers).

	CLFFTConvolutionLayer (CLFFTConvolutionLayer &&)=default
	Default move constructor.

CLFFTConvolutionLayer &	operator= (const CLFFTConvolutionLayer &)=delete
	Prevent instances of this class from being copied (as this class contains pointers).

CLFFTConvolutionLayer &	operator= (CLFFTConvolutionLayer &&)=default
	Default move assignment operator.

void	configure (ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, const PadStrideInfo &conv_info, const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false)
	Set the input and output tensors.

void	configure (const CLCompileContext &compile_context, ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, const PadStrideInfo &conv_info, const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false)
	Set the input and output tensors.

void	run () override
	Run the kernels contained in the function.

void	prepare () override
	Prepare the function for executing.

Public Member Functions inherited from IFunction
virtual	~IFunction ()=default
	Destructor.

Static Public Member Functions
static Status	validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, const PadStrideInfo &conv_info, const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false)
	Static function to check if given info will lead to a valid configuration of CLFFTConvolutionLayer.

Detailed Description

Basic function to execute FFT-based convolution on OpenCL.

This function calls the following OpenCL functions/kernels:

CLPermute Permute input if NHWC (the FFT pipeline itself supports only NCHW).
CLPadLayer Pad input.
CLFFT2D Forward transform to the frequency domain.
CLComplexPixelWiseMultiplication Complex element-wise product of input and the weights.
CLReductionOperation Reduction across channels.
CLFFT2D Inverse transform back to the time domain.
CLStridedSlice Extract valid output.
CLArithmeticAddition Add bias.
CLActivationLayer Perform activation.
CLPermute Permute output if NHWC (the FFT pipeline itself supports only NCHW).
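
The sequence above (flip and pad, forward transform, element-wise product, inverse transform, extract the valid region) can be illustrated with a minimal one-dimensional, single-channel sketch in plain C++. It is not the library's implementation: a naive O(n^2) DFT stands in for CLFFT2D, there is no channel reduction, bias or activation, and the names dft, fft_conv_same and direct_conv_same are hypothetical. The extraction offset kernel - pad - 1 follows the start_left expression in the configure() listing further down this page.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Naive O(n^2) DFT (stand-in for an FFT). inverse=true flips the sign of the
// exponent and scales the result by 1/n.
std::vector<cplx> dft(const std::vector<cplx> &in, bool inverse)
{
    const std::size_t n    = in.size();
    const double      pi   = std::acos(-1.0);
    const double      sign = inverse ? 1.0 : -1.0;
    std::vector<cplx> out(n);
    for(std::size_t k = 0; k < n; ++k)
    {
        cplx acc(0.0, 0.0);
        for(std::size_t t = 0; t < n; ++t)
        {
            const double angle = sign * 2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            acc += in[t] * cplx(std::cos(angle), std::sin(angle));
        }
        out[k] = inverse ? acc / static_cast<double>(n) : acc;
    }
    return out;
}

// "Same"-size convolution (as used by convolution layers, i.e. cross-correlation)
// computed in the frequency domain. Requires an odd kernel length.
std::vector<double> fft_conv_same(const std::vector<double> &x, const std::vector<double> &k)
{
    assert(!x.empty() && k.size() % 2 == 1);
    const std::size_t W = x.size(), K = k.size(), pad = K / 2;
    const std::size_t L = W + K - 1; // length of the full linear convolution

    std::vector<cplx> xp(L), kp(L); // zero-initialised, i.e. zero-padded to L
    for(std::size_t i = 0; i < W; ++i) xp[i] = x[i];
    for(std::size_t j = 0; j < K; ++j) kp[j] = k[K - 1 - j]; // flip the weights

    std::vector<cplx>       X  = dft(xp, false);
    const std::vector<cplx> Kf = dft(kp, false);
    for(std::size_t i = 0; i < L; ++i) X[i] *= Kf[i]; // complex element-wise product

    const std::vector<cplx> c     = dft(X, true); // back to the time domain
    const std::size_t       start = K - pad - 1;  // first valid output element
    std::vector<double>     y(W);
    for(std::size_t i = 0; i < W; ++i) y[i] = c[start + i].real();
    return y;
}

// Reference: direct sliding-window computation with zero padding of K/2.
std::vector<double> direct_conv_same(const std::vector<double> &x, const std::vector<double> &k)
{
    const long          W = static_cast<long>(x.size()), K = static_cast<long>(k.size()), pad = K / 2;
    std::vector<double> y(x.size(), 0.0);
    for(long i = 0; i < W; ++i)
    {
        for(long j = 0; j < K; ++j)
        {
            const long idx = i + j - pad; // may fall into the zero padding
            if(idx >= 0 && idx < W) y[i] += x[idx] * k[j];
        }
    }
    return y;
}
```

Calling both functions on the same input and kernel should give results that agree up to floating-point rounding, which is the property the FFT-based path relies on.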

Definition at line 59 of file CLFFTConvolutionLayer.h.

Constructor & Destructor Documentation

◆ CLFFTConvolutionLayer() [1/3]

CLFFTConvolutionLayer ( std::shared_ptr< IMemoryManager > memory_manager = nullptr )

Default constructor.

Definition at line 63 of file CLFFTConvolutionLayer.cpp.

     : _memory_group(memory_manager),
       _flip_weights_func(),
       _permute_input_func(),
       _permute_output_func(),
       _permute_weights_func(),
       _permute_bias_func(),
       _pad_input_func(),
       _pad_weights_func(),
       _transform_input_func(memory_manager),
       _transform_weights_func(),
       _itransform_output_func(memory_manager),
       _prod_func(),
       _reduce_func(),
       _extract_output_func(),
       _bias_add_func(),
       _activation_layer_func(),
       _permuted_input(),
       _permuted_weights(),
       _permuted_bias(),
       _permuted_output(),
       _padded_input(),
       _padded_weights(),
       _flip_axis(),
       _flipped_weights(),
       _transformed_input(),
       _transformed_weights(),
       _input_weights_product(),
       _output_product(),
       _output_reduced(),
       _itransformed_output(),
       _reshaped_output(),
       _bias_output(),
       _original_weights(nullptr),
       _original_bias(nullptr),
       _is_activationlayer_enabled(false),
       _needs_permute(false),
       _has_bias(false),
       _is_prepared(false)
 {
 }

◆ CLFFTConvolutionLayer() [2/3]

CLFFTConvolutionLayer ( const CLFFTConvolutionLayer & )

delete

Prevent instances of this class from being copied (as this class contains pointers).

◆ CLFFTConvolutionLayer() [3/3]

CLFFTConvolutionLayer ( CLFFTConvolutionLayer && )

default

Default move constructor.
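
Because copying is deleted and moving is defaulted, objects of this class have to be moved (or constructed in place) when they are stored in containers or handed to another owner. The sketch below shows the same pattern on a hypothetical stand-in class, MoveOnlyLayer, which is not part of the library.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for a function object that owns resources via pointers.
class MoveOnlyLayer
{
public:
    explicit MoveOnlyLayer(int id)
        : _state(std::make_unique<int>(id))
    {
    }
    MoveOnlyLayer(const MoveOnlyLayer &) = delete;            // no copies
    MoveOnlyLayer(MoveOnlyLayer &&)      = default;           // moves allowed
    MoveOnlyLayer &operator=(const MoveOnlyLayer &) = delete; // no copy assignment
    MoveOnlyLayer &operator=(MoveOnlyLayer &&) = default;     // move assignment allowed

    bool valid() const
    {
        return _state != nullptr;
    }
    int id() const
    {
        assert(valid());
        return *_state;
    }

private:
    std::unique_ptr<int> _state;
};

static_assert(!std::is_copy_constructible<MoveOnlyLayer>::value, "copying is disabled");
static_assert(std::is_move_constructible<MoveOnlyLayer>::value, "moving is allowed");

// Each element is either moved in or constructed in place; a plain copy would not compile.
std::vector<MoveOnlyLayer> make_layers()
{
    std::vector<MoveOnlyLayer> layers;
    MoveOnlyLayer              first(1);
    layers.push_back(std::move(first));
    layers.emplace_back(2);
    return layers;
}
```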

Member Function Documentation

◆ configure() [1/2]

void configure	(	ICLTensor *	input,
		const ICLTensor *	weights,
		const ICLTensor *	biases,
		ICLTensor *	output,
		const PadStrideInfo &	conv_info,
		const ActivationLayerInfo &	act_info = `ActivationLayerInfo()`,
		bool	enable_fast_math = `false`
	)

Set the input and output tensors.

Note: This function works only with square kernels and unit strides, for both NCHW and NHWC data layouts.

Parameters

[in]	input	Source tensor. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: F16/F32.
[in]	weights	Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: Same as `input`.
[in]	biases	Biases tensor. Shared biases supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: Same as `input`.
[out]	output	Destination tensor. The 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: Same as `input`.
[in]	conv_info	Contains padding and stride information described in PadStrideInfo.
[in]	act_info	(Optional) Activation layer information in case of a fused activation.
[in]	enable_fast_math	(Optional) Enable fast math computation. If this flag is set, the function may dispatch the fastest implementation available, which may also reduce accuracy. Default is false.
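
As a quick illustration of how the shapes in the parameter list relate to each other, the following sketch derives the expected output shape and bias length from an input and a weights shape. It uses a plain std::array instead of the library's TensorShape, assumes the NCHW ordering [width, height, channels, batch] used in the descriptions above, and relies on the rule enforced by validate() that the output width and height equal the input width and height. The names Shape4, ConvShapes and expected_shapes are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Dimension 0 is the innermost one: [width, height, channels, batch].
using Shape4 = std::array<std::size_t, 4>;

struct ConvShapes
{
    Shape4      output;   // [width, height, OFM, batch]
    std::size_t bias_len; // [OFM]
};

// input:   [width, height, IFM, batch]
// weights: [kernel_x, kernel_y, IFM, OFM]
ConvShapes expected_shapes(const Shape4 &input, const Shape4 &weights)
{
    assert(input[2] == weights[2]); // both tensors must agree on IFM
    const std::size_t ofm = weights[3];
    return ConvShapes{ Shape4{ input[0], input[1], ofm, input[3] }, ofm };
}
```

For example, a 224x224 input with 3 channels and a 5x5 kernel producing 16 feature maps would give an output of [224, 224, 16, batch] and a bias of length 16.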

Definition at line 105 of file CLFFTConvolutionLayer.cpp.

References CLKernelLibrary::get().

 {
     configure(CLKernelLibrary::get().get_compile_context(), input, weights, biases, output, conv_info, act_info, enable_fast_math);
 }

◆ configure() [2/2]

void configure	(	const CLCompileContext &	compile_context,
		ICLTensor *	input,
		const ICLTensor *	weights,
		const ICLTensor *	biases,
		ICLTensor *	output,
		const PadStrideInfo &	conv_info,
		const ActivationLayerInfo &	act_info = `ActivationLayerInfo()`,
		bool	enable_fast_math = `false`
	)

Set the input and output tensors.

Note: This function works only with square kernels and unit strides, for both NCHW and NHWC data layouts.

Parameters

[in]	compile_context	The compile context to be used.
[in]	input	Source tensor. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: F16/F32.
[in]	weights	Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: Same as `input`.
[in]	biases	Biases tensor. Shared biases supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: Same as `input`.
[out]	output	Destination tensor. The 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: Same as `input`.
[in]	conv_info	Contains padding and stride information described in PadStrideInfo.
[in]	act_info	(Optional) Activation layer information in case of a fused activation.
[in]	enable_fast_math	(Optional) Enable fast math computation. If this flag is set, the function may dispatch the fastest implementation available, which may also reduce accuracy. Default is false.

Definition at line 111 of file CLFFTConvolutionLayer.cpp.

References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_THROW_ON, ARM_COMPUTE_UNUSED, arm_compute::auto_init_if_empty(), ICLTensor::buffer(), ICloneable< T >::clone(), TensorInfo::clone(), CLReverse::configure(), CLPermute::configure(), CLFFT2D::configure(), CLActivationLayer::configure(), CLPadLayer::configure(), CLReductionOperation::configure(), CLArithmeticAddition::configure(), CLSlice::configure(), CLComplexPixelWiseMultiplication::configure(), arm_compute::test::validation::conv_info, ITensorInfo::data_layout(), FFT2DInfo::direction, arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, arm_compute::test::validation::idx_height, arm_compute::test::validation::idx_width, ITensor::info(), CLTensor::info(), ITensorAllocator::init(), arm_compute::test::validation::input, arm_compute::Inverse, MemoryGroup::manage(), CLTensor::map(), arm_compute::NCHW, arm_compute::NHWC, PadStrideInfo::pad_bottom(), PadStrideInfo::pad_left(), PadStrideInfo::pad_right(), PadStrideInfo::pad_top(), TensorShape::remove_dimension(), TensorInfo::set_data_layout(), arm_compute::SUM, ITensorInfo::tensor_shape(), TensorInfo::tensor_shape(), arm_compute::U, arm_compute::U32, CLTensor::unmap(), CLFFTConvolutionLayer::validate(), arm_compute::WIDTH, arm_compute::WRAP, Dimensions< T >::x(), and Dimensions< T >::y().

 {
     ARM_COMPUTE_UNUSED(enable_fast_math);
     ARM_COMPUTE_ERROR_THROW_ON(CLFFTConvolutionLayer::validate(input->info(), weights->info(), biases != nullptr ? biases->info() : nullptr, output->info(), conv_info, act_info, enable_fast_math));
 
     _original_weights = weights;
     _original_bias    = biases;
 
     // Flag if bias addition is required
     _has_bias = biases != nullptr;
 
     // Get indices for the width and height
     const size_t idx_width  = get_data_layout_dimension_index(input->info()->data_layout(), DataLayoutDimension::WIDTH);
     const size_t idx_height = get_data_layout_dimension_index(input->info()->data_layout(), DataLayoutDimension::HEIGHT);
 
     // Input shape, kernel size and output tile
     const Size2D input_dims  = Size2D(input->info()->tensor_shape()[idx_width], input->info()->tensor_shape()[idx_height]);
     const Size2D kernel_size = Size2D(weights->info()->tensor_shape()[idx_width], weights->info()->tensor_shape()[idx_height]);
     const Size2D pad_valid   = Size2D(pad_decomposable(input_dims.x() + kernel_size.x() - 1),
                                       pad_decomposable(input_dims.y() + kernel_size.y() - 1));
     // Tensors to use
     ICLTensor       *input_to_use   = input;
     const ICLTensor *weights_to_use = weights;
     ICLTensor       *output_to_use  = _has_bias ? &_bias_output : output;
 
     // Permute bias
     if(biases != nullptr)
     {
         _permute_bias_func.configure(compile_context, biases, &_permuted_bias, PermutationVector(1U, 2U, 0U));
         _permuted_bias.info()->set_data_layout(DataLayout::NCHW);
     }
 
     // Permute input if needed
     _needs_permute = input->info()->data_layout() == DataLayout::NHWC;
     if(_needs_permute)
     {
         _memory_group.manage(&_permuted_input);
         // Configure the function to transform the input tensor from NHWC -> NCHW
         _permute_input_func.configure(compile_context, input, &_permuted_input, PermutationVector(1U, 2U, 0U));
         _permuted_input.info()->set_data_layout(DataLayout::NCHW);
 
         // Configure the function to transform the weights tensor from HWI -> IHW
         _permute_weights_func.configure(compile_context, weights, &_permuted_weights, PermutationVector(1U, 2U, 0U));
         _permuted_weights.info()->set_data_layout(DataLayout::NCHW);
 
         input_to_use   = &_permuted_input;
         weights_to_use = &_permuted_weights;
     }
 
     // Flip weights
     _flipped_weights.allocator()->init(weights_to_use->info()->clone()->set_is_resizable(true).reset_padding());
     _flip_axis.allocator()->init(TensorInfo(TensorShape(2U), 1, DataType::U32));
     _flip_weights_func.configure(compile_context, weights_to_use, &_flipped_weights, &_flip_axis);
 
     // Pad weights
     const PaddingList padding_w = { { 0, input_dims.x() + pad_valid.x() - 1 }, { 0, input_dims.y() + pad_valid.y() - 1 } };
     _pad_weights_func.configure(compile_context, &_flipped_weights, &_padded_weights, padding_w);
 
     // Transform weights
     _transform_weights_func = std::make_unique<CLFFT2D>();
     _transform_weights_func->configure(compile_context, &_padded_weights, &_transformed_weights, FFT2DInfo());
 
     // Pad input
     const PaddingList padding_in = { { 0, kernel_size.x() + pad_valid.x() - 1 }, { 0, kernel_size.y() + pad_valid.y() - 1 } };
     _memory_group.manage(&_padded_input);
     _pad_input_func.configure(compile_context, input_to_use, &_padded_input, padding_in);
     if(_needs_permute)
     {
         _permuted_input.allocator()->allocate();
     }
 
     // Transform input
     _memory_group.manage(&_transformed_input);
     _transform_input_func.configure(compile_context, &_padded_input, &_transformed_input, FFT2DInfo());
     _padded_input.allocator()->allocate();
 
     // Perform product
     _memory_group.manage(&_output_product);
     _prod_func.configure(compile_context, &_transformed_input, &_transformed_weights, &_output_product);
     _transformed_input.allocator()->allocate();
 
     // Perform reduction
     _memory_group.manage(&_output_reduced);
     _reduce_func.configure(compile_context, &_output_product, &_output_reduced, 2, ReductionOperation::SUM);
     _output_product.allocator()->allocate();
 
     // Transform output
     _memory_group.manage(&_itransformed_output);
     FFT2DInfo itranform_info;
     itranform_info.direction = FFTDirection::Inverse;
     _itransformed_output.allocator()->init(_output_reduced.info()->clone()->set_is_resizable(true).set_num_channels(1).reset_padding());
     _itransform_output_func.configure(compile_context, &_output_reduced, &_itransformed_output, itranform_info);
     _output_reduced.allocator()->allocate();
 
     // Reshape output
     TensorShape reshaped_shape = _itransformed_output.info()->tensor_shape();
     reshaped_shape.remove_dimension(2);
     _reshaped_output.allocator()->init(_itransformed_output.info()->clone()->set_tensor_shape(reshaped_shape));
 
     // Extract correct region
     const int start_left = kernel_size.x() - conv_info.pad_left() - 1;
     const int start_top  = kernel_size.y() - conv_info.pad_top() - 1;
     const int end_right  = _reshaped_output.info()->tensor_shape().x() - (kernel_size.x() - conv_info.pad_right() - 1) - pad_valid.x();
     const int end_botton = _reshaped_output.info()->tensor_shape().y() - (kernel_size.y() - conv_info.pad_bottom() - 1) - pad_valid.y();
     if(_has_bias)
     {
         _memory_group.manage(&_bias_output);
     }
     else if(_needs_permute)
     {
         output_to_use = &_permuted_output;
         _memory_group.manage(&_permuted_output);
     }
     _extract_output_func.configure(compile_context, &_reshaped_output, output_to_use, Coordinates(start_left, start_top), Coordinates(end_right, end_botton));
     _itransformed_output.allocator()->allocate();
 
     // Add bias
     if(biases != nullptr)
     {
         output_to_use = output;
         if(_needs_permute)
         {
             output_to_use = &_permuted_output;
             _memory_group.manage(&_permuted_output);
         }
         auto_init_if_empty(*output_to_use->info(), *_bias_output.info());
         _bias_add_func.configure(compile_context, &_bias_output, &_permuted_bias, output_to_use, ConvertPolicy::WRAP);
         _bias_output.allocator()->allocate();
     }
 
     // Permute output
     if(_needs_permute)
     {
         // Configure the function to transform the convolved NCHW output back to the NHWC layout of the output tensor
         _permuted_output.info()->set_data_layout(DataLayout::NCHW);
         _permute_output_func.configure(compile_context, &_permuted_output, output, PermutationVector(2U, 0U, 1U));
 
         // Allocate tensors
         _permuted_output.allocator()->allocate();
     }
 
     // Configure Activation Layer
     _is_activationlayer_enabled = act_info.enabled();
     if(_is_activationlayer_enabled)
     {
         _activation_layer_func.configure(compile_context, output, nullptr, act_info);
     }
 
     // Setup flip axis data
     _flip_axis.allocator()->allocate();
     _flip_axis.map(true);
     auto axis_data = reinterpret_cast<uint32_t *>(_flip_axis.buffer());
     axis_data[0]   = 0;
     axis_data[1]   = 1;
     _flip_axis.unmap();
 }
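
The padding and extraction arithmetic in the listing above can be checked in isolation with a one-dimensional sketch. The helper pad_decomposable() is not shown on this page, so the version below is a hypothetical re-creation: it assumes the helper returns the number of extra elements needed to reach a length the FFT can decompose, and it assumes that a length is decomposable when all of its prime factors are in {2, 3, 5, 7}. The names is_decomposable, pad_to_decomposable, Slice1D and valid_region are illustrative only; the start and end expressions mirror start_left and end_right.

```cpp
#include <cassert>

// Assumption: supported lengths are products of the primes 2, 3, 5 and 7.
bool is_decomposable(unsigned int n)
{
    if(n < 2)
    {
        return false;
    }
    const unsigned int radices[] = { 2u, 3u, 5u, 7u };
    for(unsigned int radix : radices)
    {
        while(n % radix == 0)
        {
            n /= radix;
        }
    }
    return n == 1;
}

// Hypothetical re-creation of pad_decomposable(): extra elements so that n + pad is decomposable.
unsigned int pad_to_decomposable(unsigned int n)
{
    unsigned int pad = 0;
    while(!is_decomposable(n + pad))
    {
        ++pad;
    }
    return pad;
}

struct Slice1D
{
    unsigned int fft_len; // padded length fed to the FFT
    int          start;   // first extracted element (inclusive)
    int          end;     // one past the last extracted element
};

// One-dimensional version of the "Extract correct region" computation.
Slice1D valid_region(unsigned int input_len, unsigned int kernel_len, unsigned int pad_before, unsigned int pad_after)
{
    assert(kernel_len >= 1 && pad_before < kernel_len && pad_after < kernel_len);
    const unsigned int pad_valid = pad_to_decomposable(input_len + kernel_len - 1);
    const unsigned int fft_len   = input_len + kernel_len - 1 + pad_valid;
    const int          start     = static_cast<int>(kernel_len) - static_cast<int>(pad_before) - 1;
    const int          end       = static_cast<int>(fft_len) - (static_cast<int>(kernel_len) - static_cast<int>(pad_after) - 1) - static_cast<int>(pad_valid);
    return Slice1D{ fft_len, start, end };
}
```

With an odd kernel and a padding of kernel / 2 on both sides, end - start equals the input length, which matches the check in validate() that the output has the same width and height as the input.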

◆ operator=() [1/2]

CLFFTConvolutionLayer& operator= ( const CLFFTConvolutionLayer & )

delete

Prevent instances of this class from being copied (as this class contains pointers).

◆ operator=() [2/2]

CLFFTConvolutionLayer& operator= ( CLFFTConvolutionLayer && )

default

Default move assignment operator.

◆ prepare()

void prepare ( )

override virtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note: The prepare stage might not need all of the function's buffers' backing memory to be available in order to execute.

Reimplemented from IFunction.

Definition at line 352 of file CLFFTConvolutionLayer.cpp.

References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_ON, CLTensorAllocator::free(), CLScheduler::get(), ITensor::is_used(), ITensor::mark_as_unused(), CLScheduler::queue(), ICLSimpleFunction::run(), CLPermute::run(), and CLPadLayer::run().

Referenced by CLFFTConvolutionLayer::run().

 {
     if(!_is_prepared)
     {
         // Permute bias to NCHW
         if(_original_bias != nullptr)
         {
             _permuted_bias.allocator()->allocate();
             _permute_bias_func.run();
             _original_bias->mark_as_unused();
         }
 
         const ICLTensor *cur_weights = _original_weights;
         // Permute weights
         if(_needs_permute)
         {
             ARM_COMPUTE_ERROR_ON(!cur_weights->is_used());
 
             _permuted_weights.allocator()->allocate();
             _permute_weights_func.run();
             cur_weights->mark_as_unused();
             cur_weights = &_permuted_weights;
         }
 
         // Flip weights
         _flipped_weights.allocator()->allocate();
         _flip_weights_func.run();
         cur_weights->mark_as_unused();
 
         // Pad weights
         _padded_weights.allocator()->allocate();
         _pad_weights_func.run();
         _flipped_weights.mark_as_unused();
         CLScheduler::get().queue().finish();
         _flipped_weights.allocator()->free();
 
         // Transform weights to frequency domain
         _transformed_weights.allocator()->allocate();
         _transform_weights_func->run();
         _padded_weights.mark_as_unused();
         CLScheduler::get().queue().finish();
         // Delete object and release internal memory
         _transform_weights_func.reset();
         _padded_weights.allocator()->free();
 
         _is_prepared = true;
     }
 }
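
The listing above follows a prepare-once pattern: the weight transformation runs on the first call only, guarded by a flag, and run() calls prepare() itself. A minimal sketch of that pattern on a hypothetical stand-in class is shown below; PrepareOnceFunction and its members are not part of the library, and a simple reversal of the weights stands in for the flip, pad and forward FFT steps.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a function with a one-off weight transformation.
class PrepareOnceFunction
{
public:
    explicit PrepareOnceFunction(std::vector<float> weights)
        : _weights(std::move(weights))
    {
    }

    // Idempotent: the body executes only on the first call.
    void prepare()
    {
        if(!_is_prepared)
        {
            assert(_weights_in_use);
            // Stand-in for flip + pad + forward FFT of the weights: here only a flip.
            _transformed.assign(_weights.rbegin(), _weights.rend());
            _weights_in_use = false; // mirrors mark_as_unused() on the original weights
            ++_prepare_count;
            _is_prepared = true;
        }
    }

    // Calls prepare() first, so an explicit prepare() call is optional.
    float run(const std::vector<float> &input)
    {
        prepare();
        float acc = 0.f;
        for(std::size_t i = 0; i < input.size() && i < _transformed.size(); ++i)
        {
            acc += input[i] * _transformed[i];
        }
        return acc;
    }

    int prepare_count() const
    {
        return _prepare_count;
    }
    bool weights_in_use() const
    {
        return _weights_in_use;
    }

private:
    std::vector<float> _weights;
    std::vector<float> _transformed{};
    bool               _weights_in_use{ true };
    bool               _is_prepared{ false };
    int                _prepare_count{ 0 };
};
```

Calling run() several times leaves prepare_count() at 1, so the cost of transforming the weights is paid once and amortised over all subsequent runs.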

◆ run()

void run ( )

override virtual

Run the kernels contained in the function.

For Neon kernels:

Multi-threading is used for the kernels which are parallelisable.
By default std::thread::hardware_concurrency() threads are used.

Note: CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

All the kernels are enqueued on the queue associated with CLScheduler.
The queue is then flushed.

Note: The function will not block until the kernels are executed; it is the user's responsibility to wait. It will call prepare() on the first run if that has not been done already.

Implements IFunction.

Definition at line 313 of file CLFFTConvolutionLayer.cpp.

References CLTensor::allocator(), CLTensor::cl_buffer(), CLTensorAllocator::import_memory(), CLFFTConvolutionLayer::prepare(), CLFFT2D::run(), CLPermute::run(), CLActivationLayer::run(), CLReductionOperation::run(), CLPadLayer::run(), CLArithmeticAddition::run(), CLSlice::run(), and CLComplexPixelWiseMultiplication::run().

 {
     prepare();
 
     MemoryGroupResourceScope scope_mg(_memory_group);
 
     // Transform input
     if(_needs_permute)
     {
         _permute_input_func.run();
     }
     _pad_input_func.run();
     _transform_input_func.run();
 
     // Perform operations to frequency domain
     _prod_func.run();
     _reduce_func.run();
 
     // Transform output
     _itransform_output_func.run();
     _reshaped_output.allocator()->import_memory(_itransformed_output.cl_buffer());
     _extract_output_func.run();
     // Add bias
     if(_has_bias)
     {
         _bias_add_func.run();
     }
     if(_needs_permute)
     {
         _permute_output_func.run();
     }
 
     // Run activation layer
     if(_is_activationlayer_enabled)
     {
         _activation_layer_func.run();
     }
 }
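
The permute stages at the start and end of run() use the vectors PermutationVector(1U, 2U, 0U) and PermutationVector(2U, 0U, 1U) set up in configure(). The sketch below shows how such vectors reorder a shape, under two assumptions that this page does not state: that an NHWC tensor stores its dimensions as [channels, width, height] with dimension 0 innermost (and NCHW as [width, height, channels]), and that a permutation is applied as out[i] = in[perm[i]]. Shape3 and permute_shape are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Shape3 = std::array<std::size_t, 3>;

// Assumed semantics: element i of the result is taken from position perm[i] of the input.
Shape3 permute_shape(const Shape3 &in, const Shape3 &perm)
{
    Shape3 out{};
    for(std::size_t i = 0; i < 3; ++i)
    {
        assert(perm[i] < 3);
        out[i] = in[perm[i]];
    }
    return out;
}
```

Under these assumptions, applying (1, 2, 0) to [C, W, H] yields [W, H, C], and applying (2, 0, 1) to the result restores [C, W, H], so the two vectors are inverses of each other.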

◆ validate()

Status validate	(	const ITensorInfo *	input,
		const ITensorInfo *	weights,
		const ITensorInfo *	biases,
		const ITensorInfo *	output,
		const PadStrideInfo &	conv_info,
		const ActivationLayerInfo &	act_info = `ActivationLayerInfo()`,
		bool	enable_fast_math = `false`
	)

static

Static function to check if given info will lead to a valid configuration of CLFFTConvolutionLayer.

Note: This function works only with square kernels and unit strides, for both NCHW and NHWC data layouts.

Parameters

[in]	input	Source tensor. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: F16/F32.
[in]	weights	Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: Same as `input`.
[in]	biases	Biases tensor. Shared biases supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: Same as `input`.
[out]	output	Destination tensor. The 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: Same as `input`.
[in]	conv_info	Contains padding and stride information described in PadStrideInfo.
[in]	act_info	(Optional) Activation layer information in case of a fused activation.
[in]	enable_fast_math	(Optional) Enable fast math computation. If this flag is set, the function may dispatch the fastest implementation available, which may also reduce accuracy. Default is false.

Returns: a status

Definition at line 269 of file CLFFTConvolutionLayer.cpp.

References ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ON_ERROR, ITensorInfo::data_layout(), ITensorInfo::data_type(), ActivationLayerInfo::enabled(), arm_compute::F16, arm_compute::F32, arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, arm_compute::test::validation::idx_height, arm_compute::test::validation::idx_width, PadStrideInfo::pad_bottom(), PadStrideInfo::pad_left(), PadStrideInfo::pad_right(), PadStrideInfo::pad_top(), PadStrideInfo::stride(), ITensorInfo::tensor_shape(), ITensorInfo::total_size(), CLActivationLayer::validate(), arm_compute::WIDTH, and Dimensions< T >::x().

Referenced by CLFFTConvolutionLayer::configure(), CLConvolutionLayer::get_convolution_method(), and CLConvolutionLayer::validate().

 {
     ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(input, 1, DataType::F16, DataType::F32);
     ARM_COMPUTE_RETURN_ERROR_ON((input->data_type() == DataType::F16) && !enable_fast_math);
     ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(input, weights);
 
     // Get indices for the width and height
     const size_t idx_width  = get_data_layout_dimension_index(input->data_layout(), DataLayoutDimension::WIDTH);
     const size_t idx_height = get_data_layout_dimension_index(input->data_layout(), DataLayoutDimension::HEIGHT);
 
     // Input shape, kernel size and output tile
     const Size2D kernel_size = Size2D(weights->tensor_shape()[idx_width], weights->tensor_shape()[idx_height]);
 
     // Strides
     const auto strides = conv_info.stride();
     ARM_COMPUTE_RETURN_ERROR_ON(strides.first != strides.second && strides.first != 1);
     ARM_COMPUTE_RETURN_ERROR_ON(kernel_size.x() != kernel_size.y());
     ARM_COMPUTE_RETURN_ERROR_ON(conv_info.pad_left() != (kernel_size.x() / 2) || conv_info.pad_right() != (kernel_size.x() / 2));
     ARM_COMPUTE_RETURN_ERROR_ON(conv_info.pad_top() != (kernel_size.y() / 2) || conv_info.pad_bottom() != (kernel_size.y() / 2));
 
     // Validate biases
     if(biases != nullptr)
     {
         ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(input, biases);
         ARM_COMPUTE_RETURN_ERROR_ON(weights->tensor_shape()[3] != biases->tensor_shape().x());
     }
 
     // Checks performed when output is configured
     if((output != nullptr) && (output->total_size() != 0))
     {
         ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(input, output);
         ARM_COMPUTE_RETURN_ERROR_ON((input->tensor_shape()[idx_height] != output->tensor_shape()[idx_height]) || (input->tensor_shape()[idx_width] != output->tensor_shape()[idx_width]));
 
         // Validate Activation Layer
         if(act_info.enabled())
         {
             ARM_COMPUTE_RETURN_ON_ERROR(CLActivationLayer::validate(output, nullptr, act_info));
         }
     }
 
     return Status{};
 }
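
The configuration checks in the listing above can be restated as a small standalone predicate, which may help when deciding up front whether a layer is a candidate for the FFT path. This is a sketch only: ConvConfig and check_fft_conv are hypothetical, the data-type mismatch, bias and output checks are omitted, and the conditions are copied from the listing as written.

```cpp
#include <cassert>
#include <string>

// Hypothetical plain-data description of a convolution configuration.
struct ConvConfig
{
    unsigned int kernel_w, kernel_h;
    unsigned int stride_x, stride_y;
    unsigned int pad_left, pad_right, pad_top, pad_bottom;
    bool         is_f16;
    bool         enable_fast_math;
};

// Returns an empty string if the configuration passes, otherwise the reason for rejection.
std::string check_fft_conv(const ConvConfig &c)
{
    if(c.is_f16 && !c.enable_fast_math)
    {
        return "F16 requires enable_fast_math";
    }
    // Same expression as in the listing: rejects only unequal strides whose first component is not 1.
    if(c.stride_x != c.stride_y && c.stride_x != 1)
    {
        return "unsupported strides";
    }
    if(c.kernel_w != c.kernel_h)
    {
        return "kernel must be square";
    }
    if(c.pad_left != c.kernel_w / 2 || c.pad_right != c.kernel_w / 2)
    {
        return "horizontal padding must be kernel / 2";
    }
    if(c.pad_top != c.kernel_h / 2 || c.pad_bottom != c.kernel_h / 2)
    {
        return "vertical padding must be kernel / 2";
    }
    return "";
}
```

A 3x3 kernel with unit strides and a padding of 1 on every side passes, while the same configuration with F16 data and enable_fast_math left at false is rejected. Because the stride condition is reproduced exactly as it appears in the listing, a stride of (2, 2) also passes this particular check, even though the note above restricts the function to unit strides.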

The documentation for this class was generated from the following files:

arm_compute/runtime/CL/functions/CLFFTConvolutionLayer.h
src/runtime/CL/functions/CLFFTConvolutionLayer.cpp
