Compute Library
 24.04
NEConvolutionLayer Class Reference

Basic function to simulate a convolution layer. More...

#include <NEConvolutionLayer.h>


Public Member Functions

 NEConvolutionLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Constructor. More...
 
 NEConvolutionLayer (const NEConvolutionLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
NEConvolutionLayer & operator= (const NEConvolutionLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 NEConvolutionLayer (NEConvolutionLayer &&)=default
 Default move constructor. More...
 
NEConvolutionLayer & operator= (NEConvolutionLayer &&)=default
 Default move assignment operator. More...
 
 ~NEConvolutionLayer ()
 Default destructor. More...
 
void configure (ITensor *input, const ITensor *weights, const ITensor *biases, ITensor *output, const PadStrideInfo &conv_info, const WeightsInfo &weights_info=WeightsInfo(), const Size2D &dilation=Size2D(1U, 1U), const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false, unsigned int num_groups=1)
 Set the input and output tensors. More...
 
void run () override
 Run the kernels contained in the function. More...
 
void prepare () override
 Prepare the function for executing. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, const PadStrideInfo &conv_info, const WeightsInfo &weights_info=WeightsInfo(), const Size2D &dilation=Size2D(1U, 1U), const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false, unsigned int num_groups=1)
 Static function to check if given info will lead to a valid configuration of NEConvolutionLayer. More...
 
static ConvolutionMethod get_convolution_method (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *output, const PadStrideInfo &conv_info, const WeightsInfo &weights_info=WeightsInfo(), const Size2D &dilation=Size2D(1U, 1U), const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false)
 Static function to check if given info will return the convolution called by NEConvolutionLayer. More...
 

Detailed Description

Basic function to simulate a convolution layer.

This function calls one of the following functions:

  1. cpu::CpuGemmConv2d (executed only in case GEMM is required for the operation)
  2. cpu::CpuWinogradConv2d (executed only in case Winograd is required for the operation)
  3. cpu::CpuDirectConv2d (executed only in case Direct Convolution is required for the operation)
  4. NEFFTConvolutionLayer (executed only in case FFT is required for the operation)

The function selects one of the algorithms mentioned above based on:

  • The size of the kernel
  • Number of input/output feature maps
  • Amount of memory needed

Generally GEMM-based convolution is executed when neither Winograd nor FFT nor Direct convolution can be performed.

FP32:

Algorithm   Filter Size                                           Input/Output feature maps
Winograd    3x3, 1x3, 3x1, 5x1, 1x5, 5x5 (fast maths), 7x1, 1x7   Input channels greater than 3
FFT         Squared kernels, greater than 9x9                     Input feature maps > Output feature maps
DirectConv  9x9                                                   -
GEMM        Any size                                              -

Winograd 5x5 requires fast maths enabled.

FP16:

Algorithm   Filter Size
Winograd    Not supported
FFT         Not supported
DirectConv  9x9
GEMM        Any size

Definition at line 72 of file NEConvolutionLayer.h.

Constructor & Destructor Documentation

◆ NEConvolutionLayer() [1/3]

NEConvolutionLayer ( std::shared_ptr< IMemoryManager >  memory_manager = nullptr )

Constructor.

Definition at line 56 of file NEConvolutionLayer.cpp.

    : _impl(std::make_unique<Impl>())
{
    _impl->memory_manager = std::move(memory_manager);
}

◆ NEConvolutionLayer() [2/3]

NEConvolutionLayer ( const NEConvolutionLayer & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ NEConvolutionLayer() [3/3]

NEConvolutionLayer ( NEConvolutionLayer && )
default

Default move constructor.

◆ ~NEConvolutionLayer()

~NEConvolutionLayer ( )
default

Default destructor.

Member Function Documentation

◆ configure()

void configure ( ITensor *  input,
const ITensor *  weights,
const ITensor *  biases,
ITensor *  output,
const PadStrideInfo &  conv_info,
const WeightsInfo &  weights_info = WeightsInfo(),
const Size2D &  dilation = Size2D(1U, 1U),
const ActivationLayerInfo &  act_info = ActivationLayerInfo(),
bool  enable_fast_math = false,
unsigned int  num_groups = 1
)

Set the input and output tensors.

Valid data layouts:

  • NHWC
  • NCHW

Valid data type configurations:

src0 src1 src2 dst
F16 F16 F16 F16
F32 F32 F32 F32
QASYMM8 QASYMM8 S32 QASYMM8
QASYMM8 QSYMM8_PER_CHANNEL S32 QASYMM8
QASYMM8_SIGNED QASYMM8_SIGNED S32 QASYMM8_SIGNED
QASYMM8_SIGNED QSYMM8_PER_CHANNEL S32 QASYMM8_SIGNED
Parameters
[in]  input             Source tensor. 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32.
[in]  weights           Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: Same as input; may also be QSYMM8_PER_CHANNEL if input is QASYMM8/QASYMM8_SIGNED.
[in]  biases            Biases tensor. Shared biases supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: Same as input, except for input of QASYMM8/QASYMM8_SIGNED type, where biases should be of S32 type.
[out] output            Destination tensor. 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: Same as input.
[in]  conv_info         Contains padding and stride information described in PadStrideInfo.
[in]  weights_info      Specifies if the weights tensor has been reshaped with NEWeightsReshapeKernel. If this is not part of the fully connected layer, the weights tensor has also been transposed with cpu::kernels::CpuGemmTranspose1xWKernel. Data type supported: Same as input.
[in]  dilation          (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
[in]  act_info          (Optional) Activation layer information in case of a fused activation. Only RELU, BOUNDED_RELU and LU_BOUNDED_RELU are supported.
[in]  enable_fast_math  (Optional) Enable fast math computation. If set, the function may dispatch the fastest implementation available, which may also reduce accuracy. Default is false.
[in]  num_groups        (Optional) Number of groups when performing a grouped convolution. num_groups != 1 is not supported.

Definition at line 63 of file NEConvolutionLayer.cpp.

{
    // Perform validate step
    ARM_COMPUTE_ERROR_ON_NULLPTR(input, weights, output);
    ARM_COMPUTE_ERROR_THROW_ON(NEConvolutionLayer::validate(
        input->info(), weights->info(), ((biases != nullptr) ? biases->info() : nullptr), output->info(), conv_info,
        weights_info, dilation, act_info, enable_fast_math, num_groups));
    ARM_COMPUTE_LOG_PARAMS(input, weights, biases, output, conv_info, weights_info, dilation, act_info,
                           enable_fast_math, num_groups);

    const Conv2dInfo info(conv_info, dilation, act_info, enable_fast_math, num_groups);
    switch (cpu::CpuConv2d::get_convolution_method(input->info(), weights->info(), output->info(), conv_info,
                                                   weights_info, dilation, act_info, enable_fast_math))
    {
        case ConvolutionMethod::WINOGRAD:
        case ConvolutionMethod::GEMM:
        case ConvolutionMethod::GEMM_CONV2D:
        case ConvolutionMethod::DIRECT:
        {
            auto f = std::make_unique<cpu::CpuConv2d>();
            f->configure(input->info(), weights->info(), ((biases != nullptr) ? biases->info() : nullptr),
                         output->info(), conv_info, weights_info, dilation, act_info, enable_fast_math, num_groups);
            _impl->op = std::move(f);
            break;
        }
        case ConvolutionMethod::FFT:
        {
            auto f = std::make_unique<NEFFTConvolutionLayer>(_impl->memory_manager);
            f->configure(input, weights, biases, output, conv_info, act_info);
            _impl->func = std::move(f);
            break;
        }
        default:
            ARM_COMPUTE_ERROR("Not supported.");
            break;
    }

    if (_impl->op)
    {
        _impl->memory_group = MemoryGroup(std::move(_impl->memory_manager));
        _impl->aux_mem_req  = _impl->op->workspace();
        _impl->run_pack     = {{ACL_SRC_0, input}, {ACL_SRC_1, weights}, {ACL_SRC_2, biases}, {ACL_DST, output}};
        _impl->prep_pack    = {{ACL_SRC_1, weights}, {ACL_SRC_2, biases}};
        _impl->workspace =
            manage_workspace<Tensor>(_impl->aux_mem_req, _impl->memory_group, _impl->run_pack, _impl->prep_pack);
    }
}

References arm_compute::ACL_DST, arm_compute::ACL_SRC_0, arm_compute::ACL_SRC_1, arm_compute::ACL_SRC_2, arm_compute::test::validation::act_info, ARM_COMPUTE_ERROR, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, ARM_COMPUTE_LOG_PARAMS, ARM_COMPUTE_UNUSED, arm_compute::test::validation::conv_info, arm_compute::DIRECT, arm_compute::FFT, arm_compute::GEMM, arm_compute::GEMM_CONV2D, CpuConv2d::get_convolution_method(), ITensor::info(), arm_compute::test::validation::info, arm_compute::test::validation::input, arm_compute::test::validation::num_groups, NEConvolutionLayer::validate(), arm_compute::test::validation::weights_info, and arm_compute::WINOGRAD.

Referenced by NEDeconvolutionLayer::configure().

◆ get_convolution_method()

ConvolutionMethod get_convolution_method ( const ITensorInfo *  input,
const ITensorInfo *  weights,
const ITensorInfo *  output,
const PadStrideInfo &  conv_info,
const WeightsInfo &  weights_info = WeightsInfo(),
const Size2D &  dilation = Size2D(1U, 1U),
const ActivationLayerInfo &  act_info = ActivationLayerInfo(),
bool  enable_fast_math = false
)
static

Static function to check if given info will return the convolution called by NEConvolutionLayer.

Parameters
[in]  input             Source tensor. 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32.
[in]  weights           Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: Same as input; may also be QSYMM8_PER_CHANNEL if input is QASYMM8/QASYMM8_SIGNED.
[in]  output            Destination tensor. 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: Same as input.
[in]  conv_info         Contains padding and stride information described in PadStrideInfo.
[in]  weights_info      Specifies if the weights tensor has been reshaped with NEWeightsReshapeKernel. If this is not part of the fully connected layer, the weights tensor has also been transposed with cpu::kernels::CpuGemmTranspose1xWKernel. Data type supported: Same as input.
[in]  dilation          (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
[in]  act_info          (Optional) Activation layer information in case of a fused activation.
[in]  enable_fast_math  (Optional) Enable fast math computation. If set, the function may dispatch the fastest implementation available, which may also reduce accuracy. Default is false.
Returns
the Convolution Method Hint

Definition at line 165 of file NEConvolutionLayer.cpp.

{
    return cpu::CpuConv2d::get_convolution_method(input, weights, output, conv_info, weights_info, dilation, act_info,
                                                  enable_fast_math);
}

References arm_compute::test::validation::act_info, arm_compute::test::validation::conv_info, CpuConv2d::get_convolution_method(), arm_compute::test::validation::input, and arm_compute::test::validation::weights_info.

◆ operator=() [1/2]

NEConvolutionLayer & operator= ( const NEConvolutionLayer & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

NEConvolutionLayer& operator= ( NEConvolutionLayer &&  )
default

Default move assignment operator.

◆ prepare()

void prepare ( )
overridevirtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note
Prepare stage might not need all the function's buffers' backing memory to be available in order to execute

Reimplemented from IFunction.

Definition at line 194 of file NEConvolutionLayer.cpp.

{
    if (_impl->func)
    {
        _impl->func->prepare();
    }
    else
    {
        _impl->op->prepare(_impl->prep_pack);

        // Release temporary tensors that are only used in prepare stage
        release_temporaries<Tensor>(_impl->aux_mem_req, _impl->workspace);
    }
}

Referenced by NEDeconvolutionLayer::prepare(), and NEConvolutionLayer::run().

◆ run()

void run ( )
overridevirtual

Run the kernels contained in the function.

For CPU kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
prepare() is called on the first run if it hasn't already been done

Implements IFunction.

Definition at line 178 of file NEConvolutionLayer.cpp.

{
    prepare();

    MemoryGroupResourceScope scope_mg(_impl->memory_group);

    if (_impl->func)
    {
        _impl->func->run();
    }
    else
    {
        _impl->op->run(_impl->run_pack);
    }
}

References NEConvolutionLayer::prepare().

Referenced by NEDeconvolutionLayer::run().

◆ validate()

Status validate ( const ITensorInfo *  input,
const ITensorInfo *  weights,
const ITensorInfo *  biases,
const ITensorInfo *  output,
const PadStrideInfo &  conv_info,
const WeightsInfo &  weights_info = WeightsInfo(),
const Size2D &  dilation = Size2D(1U, 1U),
const ActivationLayerInfo &  act_info = ActivationLayerInfo(),
bool  enable_fast_math = false,
unsigned int  num_groups = 1
)
static

Static function to check if given info will lead to a valid configuration of NEConvolutionLayer.

Parameters
[in]  input             Source tensor. 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32.
[in]  weights           Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: Same as input; may also be QSYMM8_PER_CHANNEL if input is QASYMM8/QASYMM8_SIGNED.
[in]  biases            Biases tensor. Shared biases supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: Same as input, except for input of QASYMM8/QASYMM8_SIGNED type, where biases should be of S32 type.
[in]  output            Destination tensor. 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: Same as input.
[in]  conv_info         Contains padding and stride information described in PadStrideInfo.
[in]  weights_info      Specifies if the weights tensor has been reshaped with NEWeightsReshapeKernel. If this is not part of the fully connected layer, the weights tensor has also been transposed with cpu::kernels::CpuGemmTranspose1xWKernel. Data type supported: Same as input.
[in]  dilation          (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
[in]  act_info          (Optional) Activation layer information in case of a fused activation.
[in]  enable_fast_math  (Optional) Enable fast math computation. If set, the function may dispatch the fastest implementation available, which may also reduce accuracy. Default is false.
[in]  num_groups        (Optional) Number of groups when performing a grouped convolution. num_groups != 1 is not supported.
Returns
a status

Definition at line 121 of file NEConvolutionLayer.cpp.

{
    const Conv2dInfo info(conv_info, dilation, act_info, enable_fast_math, num_groups);

    ARM_COMPUTE_RETURN_ERROR_ON_MSG(!weights->are_values_constant(), "Dynamic weights are not supported");

    // Biases with dynamic values are not supported with quantized inputs.
    if (biases)
    {
        ARM_COMPUTE_RETURN_ERROR_ON_MSG((!biases->are_values_constant() && is_data_type_quantized(input->data_type())),
                                        "Dynamic Biases are not supported with quantized input data.");
    }

    switch (cpu::CpuConv2d::get_convolution_method(input, weights, output, conv_info, weights_info, dilation, act_info,
                                                   enable_fast_math))
    {
        case ConvolutionMethod::WINOGRAD:
        case ConvolutionMethod::GEMM:
        case ConvolutionMethod::GEMM_CONV2D:
        case ConvolutionMethod::DIRECT:
            ARM_COMPUTE_RETURN_ON_ERROR(cpu::CpuConv2d::validate(input, weights, biases, output, conv_info,
                                                                 weights_info, dilation, act_info, enable_fast_math,
                                                                 num_groups));
            break;
        case ConvolutionMethod::FFT:
            ARM_COMPUTE_RETURN_ON_ERROR(
                NEFFTConvolutionLayer::validate(input, weights, biases, output, conv_info, act_info));
            break;
        default:
            ARM_COMPUTE_ERROR("Not supported.");
            break;
    }
    return Status{};
}

References arm_compute::test::validation::act_info, ITensorInfo::are_values_constant(), ARM_COMPUTE_ERROR, ARM_COMPUTE_RETURN_ERROR_ON_MSG, ARM_COMPUTE_RETURN_ON_ERROR, arm_compute::test::validation::conv_info, arm_compute::DIRECT, arm_compute::FFT, arm_compute::GEMM, arm_compute::GEMM_CONV2D, CpuConv2d::get_convolution_method(), arm_compute::test::validation::info, arm_compute::test::validation::input, arm_compute::is_data_type_quantized(), arm_compute::test::validation::num_groups, NEFFTConvolutionLayer::validate(), CpuConv2d::validate(), arm_compute::test::validation::weights_info, and arm_compute::WINOGRAD.

Referenced by NEConvolutionLayer::configure(), and NEDeconvolutionLayer::validate().


The documentation for this class was generated from the following files:

  • NEConvolutionLayer.h
  • NEConvolutionLayer.cpp