Compute Library
 23.08
CLConvolutionLayer Class Reference

Basic function to compute the convolution layer. More...

#include <CLConvolutionLayer.h>


Public Member Functions

 CLConvolutionLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Default constructor. More...
 
 ~CLConvolutionLayer ()
 Default Destructor. More...
 
 CLConvolutionLayer (const CLConvolutionLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 CLConvolutionLayer (CLConvolutionLayer &&)=default
 Default move constructor. More...
 
CLConvolutionLayer & operator= (const CLConvolutionLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
CLConvolutionLayer & operator= (CLConvolutionLayer &&)=default
 Default move assignment operator. More...
 
void configure (ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, const PadStrideInfo &conv_info, const WeightsInfo &weights_info=WeightsInfo(), const Size2D &dilation=Size2D(1U, 1U), const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false, unsigned int num_groups=1, const experimental::PostOpList< ICLTensor * > &post_ops=experimental::PostOpList< ICLTensor * > {})
 Set the input and output tensors. More...
 
void configure (const CLCompileContext &compile_context, ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, const PadStrideInfo &conv_info, const WeightsInfo &weights_info=WeightsInfo(), const Size2D &dilation=Size2D(1U, 1U), const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false, unsigned int num_groups=1, const experimental::PostOpList< ICLTensor * > &post_ops=experimental::PostOpList< ICLTensor * > {})
 Set the input and output tensors. More...
 
void run () override
 Run the kernels contained in the function. More...
 
void prepare () override
 Prepare the function for executing. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, const PadStrideInfo &conv_info, const WeightsInfo &weights_info=WeightsInfo(), const Size2D &dilation=Size2D(1U, 1U), const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false, unsigned int num_groups=1, const experimental::PostOpList< ITensorInfo * > &post_ops=experimental::PostOpList< ITensorInfo * > {})
 Static function to check if given info will lead to a valid configuration of CLConvolutionLayer. More...
 
static ConvolutionMethod get_convolution_method (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *output, const PadStrideInfo &conv_info, const WeightsInfo &weights_info, const ActivationLayerInfo &act_info, const GPUTarget gpu_target, const Size2D &dilation=Size2D(1U, 1U), bool enable_fast_math=false)
 Static function to check which convolution method will be used by CLConvolutionLayer for the given info. More...
 

Detailed Description

Basic function to compute the convolution layer.

This function calls the following OpenCL kernels/functions:

  1. opencl::ClGemmConv2d
  2. opencl::ClWinogradConv2d
  3. opencl::ClDirectConv2d
  4. CLFFTConvolutionLayer

The function selects one of the algorithms mentioned above based on:

  • The size of the kernel
  • Number of input/output feature maps
  • Amount of memory needed

Generally GEMM-based convolution is executed when neither Winograd nor FFT nor Direct convolution can be performed.

FP32
  Algorithm    Filter Size                                           Input/Output feature maps
  Winograd     3x3, 1x3, 3x1, 5x1, 1x5, 5x5 (fast maths), 7x1, 1x7   Input channels is greater than 3
  FFT          Squared kernels and greater than 9x9                  Input feature maps > Output feature maps
  DirectConv   9x9
  GEMM         Any size

Winograd 5x5 requires fast maths enabled.

FP16
  Algorithm    Filter Size                       Input/Output feature maps
  Winograd     3x3, 1x3, 3x1, 5x1, 1x5, 5x5      Input channels is greater than 3
  FFT          Not supported
  DirectConv   9x9
  GEMM         Any size

Winograd FP16 requires fast maths enabled.

Definition at line 76 of file CLConvolutionLayer.h.
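As a point of reference, below is a minimal usage sketch of the class (not taken from the library's examples): the tensor shapes, strides and padding are illustrative assumptions, and filling the tensors with data is omitted.

#include "arm_compute/core/Types.h"
#include "arm_compute/runtime/CL/CLFunctions.h"
#include "arm_compute/runtime/CL/CLScheduler.h"
#include "arm_compute/runtime/CL/CLTensor.h"

using namespace arm_compute;

int main()
{
    // Initialise the default OpenCL context/queue used by all CL functions.
    CLScheduler::get().default_init();

    // Illustrative shapes: 224x224x3 input, 64 3x3 filters (weights laid out as [kernel_x, kernel_y, IFM, OFM]).
    CLTensor src, weights, biases, dst;
    src.allocator()->init(TensorInfo(TensorShape(224U, 224U, 3U, 1U), 1, DataType::F32));
    weights.allocator()->init(TensorInfo(TensorShape(3U, 3U, 3U, 64U), 1, DataType::F32));
    biases.allocator()->init(TensorInfo(TensorShape(64U), 1, DataType::F32));
    dst.allocator()->init(TensorInfo(TensorShape(224U, 224U, 64U, 1U), 1, DataType::F32));

    // Configure: stride 1, padding 1 keeps the spatial size; the convolution method is chosen internally.
    CLConvolutionLayer conv;
    conv.configure(&src, &weights, &biases, &dst, PadStrideInfo(1, 1, 1, 1));

    // Allocate backing memory after configuration, fill src/weights/biases (omitted), then execute.
    src.allocator()->allocate();
    weights.allocator()->allocate();
    biases.allocator()->allocate();
    dst.allocator()->allocate();

    conv.run();
    CLScheduler::get().sync(); // run() does not block; wait for the queue explicitly.
    return 0;
}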

Constructor & Destructor Documentation

◆ CLConvolutionLayer() [1/3]

CLConvolutionLayer ( std::shared_ptr< IMemoryManager > memory_manager = nullptr )

Default constructor.

Definition at line 55 of file CLConvolutionLayer.cpp.

CLConvolutionLayer::CLConvolutionLayer(std::shared_ptr<IMemoryManager> memory_manager)
    : _impl(std::make_unique<Impl>())
{
    _impl->memory_manager = std::move(memory_manager);
}
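For completeness, a hedged sketch of passing a memory manager at construction time so that intermediate buffers can be pooled. The BlobLifetimeManager/PoolManager/MemoryManagerOnDemand/CLBufferAllocator pairing below is the one used in the library's standalone examples; treating it as the right setup for this layer is an assumption, not a requirement documented here.

#include "arm_compute/runtime/BlobLifetimeManager.h"
#include "arm_compute/runtime/CL/CLBufferAllocator.h"
#include "arm_compute/runtime/CL/functions/CLConvolutionLayer.h"
#include "arm_compute/runtime/MemoryManagerOnDemand.h"
#include "arm_compute/runtime/PoolManager.h"
#include <memory>

using namespace arm_compute;

// Hypothetical helper: tensors are assumed to be initialised and allocated elsewhere.
void build_with_memory_manager(ICLTensor *src, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *dst)
{
    // Lifetime + pool managers feeding an on-demand memory manager.
    auto lifetime_mgr = std::make_shared<BlobLifetimeManager>();
    auto pool_mgr     = std::make_shared<PoolManager>();
    auto memory_mgr   = std::make_shared<MemoryManagerOnDemand>(lifetime_mgr, pool_mgr);

    CLConvolutionLayer conv(memory_mgr);
    conv.configure(src, weights, biases, dst, PadStrideInfo(1, 1, 1, 1));

    // Once every function sharing the manager has been configured, back the pools with CL buffers.
    CLBufferAllocator allocator;
    memory_mgr->populate(allocator, 1 /* num_pools */);

    conv.run();
}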

◆ ~CLConvolutionLayer()

~CLConvolutionLayer ( )
default

Default Destructor.

◆ CLConvolutionLayer() [2/3]

CLConvolutionLayer ( const CLConvolutionLayer & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ CLConvolutionLayer() [3/3]

CLConvolutionLayer ( CLConvolutionLayer && )
default

Default move constructor.

Member Function Documentation

◆ configure() [1/2]

void configure ( const CLCompileContext & compile_context,
ICLTensor * input,
const ICLTensor * weights,
const ICLTensor * biases,
ICLTensor * output,
const PadStrideInfo & conv_info,
const WeightsInfo & weights_info = WeightsInfo(),
const Size2D & dilation = Size2D(1U, 1U),
const ActivationLayerInfo & act_info = ActivationLayerInfo(),
bool  enable_fast_math = false,
unsigned int  num_groups = 1,
const experimental::PostOpList< ICLTensor * > &  post_ops = experimental::PostOpList<ICLTensor *> {} 
)

Set the input and output tensors.

Parameters
  [in]   compile_context    The compile context to be used.
  [in]   input              Source tensor. 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32.
  [in]   weights            Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: same as input; can also be QSYMM8_PER_CHANNEL if input is QASYMM8/QASYMM8_SIGNED.
  [in]   biases             Biases tensor. Shared biases are supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: same as input, except for QASYMM8/QASYMM8_SIGNED input, where biases should be of S32 type.
  [out]  output             Destination tensor. 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: same as input.
  [in]   conv_info          Contains padding and stride information described in PadStrideInfo.
  [in]   weights_info       Specifies if the weights tensor has been reshaped with CLWeightsReshapeKernel. Data type supported: same as input.
  [in]   dilation           (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
  [in]   act_info           (Optional) Activation layer information in case of a fused activation.
  [in]   enable_fast_math   (Optional) Enable fast math computation. If this flag is set, the function may dispatch the fastest implementation available, which can reduce accuracy. Default is false.
  [in]   num_groups         (Optional) Number of groups when performing a grouped convolution. num_groups != 1 is only supported for the NCHW data layout.
  [in]   post_ops           (Optional) A sequence of post operations that are performed after the main operation.

Definition at line 69 of file CLConvolutionLayer.cpp.

void CLConvolutionLayer::configure(const CLCompileContext &compile_context, ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, const PadStrideInfo &conv_info,
                                   const WeightsInfo &weights_info, const Size2D &dilation, const ActivationLayerInfo &act_info, bool enable_fast_math, unsigned int num_groups,
                                   const experimental::PostOpList<ICLTensor *> &post_ops)
{
    ARM_COMPUTE_ERROR_ON_NULLPTR(input, weights, output);
    ARM_COMPUTE_ERROR_THROW_ON(CLConvolutionLayer::validate(input->info(), weights->info(), ((biases != nullptr) ? biases->info() : nullptr), output->info(), conv_info, weights_info, dilation, act_info,
                                                            enable_fast_math, num_groups));
    ARM_COMPUTE_LOG_PARAMS(input, weights, biases, output, conv_info, weights_info, dilation, act_info, enable_fast_math, num_groups, post_ops);

    // Convert post op arguments to ITensorInfo
    auto transformed_post_ops = experimental::transform_post_op_list_arguments<ICLTensor *, ITensorInfo *>(post_ops, [](auto tensor)
    {
        return tensor->info();
    });
    const Conv2dInfo conv2d_info = Conv2dInfo(conv_info, dilation, act_info, enable_fast_math, num_groups, transformed_post_ops);

    switch(opencl::ClConv2d::get_convolution_method(input->info(), weights->info(), output->info(), conv2d_info,
                                                    weights_info, CLScheduler::get().target()))
    {
        case ConvolutionMethod::WINOGRAD:
        case ConvolutionMethod::DIRECT:
        case ConvolutionMethod::INDIRECT:
        case ConvolutionMethod::GEMM:
        {
            auto f = std::make_unique<opencl::ClConv2d>();
            f->configure(compile_context, input->info(), weights->info(), ((biases != nullptr) ? biases->info() : nullptr), output->info(), conv2d_info, weights_info);
            _impl->op = std::move(f);
            break;
        }
        case ConvolutionMethod::FFT:
        {
            ARM_COMPUTE_ERROR_ON_MSG(post_ops.size() > 0, "CLFFTConvolutionLayer does not support post ops");
            auto f = std::make_unique<CLFFTConvolutionLayer>(_impl->memory_manager);
            f->configure(compile_context, input, weights, biases, output, conv_info, act_info, enable_fast_math);
            _impl->func = std::move(f);
            break;
        }
        default:
            ARM_COMPUTE_ERROR("Not supported.");
            break;
    }

    if(_impl->op)
    {
        _impl->memory_group = MemoryGroup(std::move(_impl->memory_manager));
        _impl->aux_mem_req  = _impl->op->workspace();
        _impl->run_pack     = { { ACL_SRC_0, input }, { ACL_SRC_1, weights }, { ACL_SRC_2, biases }, { ACL_DST, output } };
        size_t post_op_tensor_index = 0;
        for(const auto &op : post_ops.get_list())
        {
            for(auto &tensor : op->arguments())
            {
                _impl->run_pack.add_const_tensor(experimental::get_post_op_arg_type(post_op_tensor_index++), *tensor);
            }
        }
        _impl->prep_pack = { { ACL_SRC_1, weights }, { ACL_SRC_2, biases } };
        _impl->workspace = manage_workspace<CLTensor>(_impl->aux_mem_req, _impl->memory_group, _impl->run_pack, _impl->prep_pack);
    }
}

References arm_compute::ACL_DST, arm_compute::ACL_SRC_0, arm_compute::ACL_SRC_1, arm_compute::ACL_SRC_2, arm_compute::test::validation::act_info, ARM_COMPUTE_ERROR, ARM_COMPUTE_ERROR_ON_MSG, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, ARM_COMPUTE_LOG_PARAMS, arm_compute::test::validation::conv_info, arm_compute::DIRECT, arm_compute::FFT, arm_compute::GEMM, CLScheduler::get(), ClConv2d::get_convolution_method(), arm_compute::experimental::get_post_op_arg_type(), arm_compute::INDIRECT, ITensor::info(), arm_compute::test::validation::input, arm_compute::test::validation::num_groups, arm_compute::test::validation::post_ops, CLScheduler::target(), tensor, CLConvolutionLayer::validate(), arm_compute::test::validation::weights_info, and arm_compute::WINOGRAD.
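A hedged sketch of using this overload with an explicit compile context. The default context retrieved from CLKernelLibrary is exactly what the other overload passes through (see its definition further down), so supplying the context yourself is only needed when you manage contexts explicitly. The helper name and the optional-argument values are illustrative assumptions; the tensors are assumed to be initialised as in the earlier sketch.

#include "arm_compute/core/CL/CLKernelLibrary.h"
#include "arm_compute/runtime/CL/functions/CLConvolutionLayer.h"

using namespace arm_compute;

// Hypothetical helper wrapping the compile-context overload.
void configure_with_context(CLConvolutionLayer &conv,
                            ICLTensor *src, const ICLTensor *weights,
                            const ICLTensor *biases, ICLTensor *dst)
{
    // The compile context the parameterless overload would use by default.
    const CLCompileContext &ctx = CLKernelLibrary::get().get_compile_context();

    conv.configure(ctx, src, weights, biases, dst,
                   PadStrideInfo(1, 1, 1, 1),
                   WeightsInfo(),   // weights have not been reshaped beforehand
                   Size2D(1U, 1U),  // no dilation
                   ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::RELU), // fused ReLU
                   /* enable_fast_math */ false,
                   /* num_groups */ 1);
}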

◆ configure() [2/2]

void configure ( ICLTensor * input,
const ICLTensor * weights,
const ICLTensor * biases,
ICLTensor * output,
const PadStrideInfo & conv_info,
const WeightsInfo & weights_info = WeightsInfo(),
const Size2D & dilation = Size2D(1U, 1U),
const ActivationLayerInfo & act_info = ActivationLayerInfo(),
bool  enable_fast_math = false,
unsigned int  num_groups = 1,
const experimental::PostOpList< ICLTensor * > &  post_ops = experimental::PostOpList<ICLTensor *> {} 
)

Set the input and output tensors.

Valid data layouts:

  • NHWC
  • NCHW

Valid data type configurations:

  src0            src1                src2  dst
  F16             F16                 F16   F16
  F32             F32                 F32   F32
  QASYMM8         QASYMM8             S32   QASYMM8
  QASYMM8         QSYMM8_PER_CHANNEL  S32   QASYMM8
  QASYMM8_SIGNED  QASYMM8_SIGNED      S32   QASYMM8_SIGNED
  QASYMM8_SIGNED  QSYMM8_PER_CHANNEL  S32   QASYMM8_SIGNED
Parameters
  [in]   input              Source tensor. 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32.
  [in]   weights            Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: same as input; can also be QSYMM8_PER_CHANNEL if input is QASYMM8/QASYMM8_SIGNED.
  [in]   biases             Biases tensor. Shared biases are supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: same as input, except for QASYMM8/QASYMM8_SIGNED input, where biases should be of S32 type.
  [out]  output             Destination tensor. 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: same as input.
  [in]   conv_info          Contains padding and stride information described in PadStrideInfo.
  [in]   weights_info       Specifies if the weights tensor has been reshaped with CLWeightsReshapeKernel. Data type supported: same as input.
  [in]   dilation           (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
  [in]   act_info           (Optional) Activation layer information in case of a fused activation.
  [in]   enable_fast_math   (Optional) Enable fast math computation. If this flag is set, the function may dispatch the fastest implementation available, which can reduce accuracy. Default is false.
  [in]   num_groups         (Optional) Number of groups when performing a grouped convolution. num_groups != 1 is only supported for the NCHW data layout.
  [in]   post_ops           (Optional) A sequence of post operations that are performed after the main operation.

Definition at line 63 of file CLConvolutionLayer.cpp.

void CLConvolutionLayer::configure(ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, const PadStrideInfo &conv_info, const WeightsInfo &weights_info,
                                   const Size2D &dilation, const ActivationLayerInfo &act_info, bool enable_fast_math, unsigned int num_groups, const experimental::PostOpList<ICLTensor *> &post_ops)
{
    configure(CLKernelLibrary::get().get_compile_context(), input, weights, biases, output, conv_info, weights_info, dilation, act_info, enable_fast_math, num_groups, post_ops);
}

References arm_compute::test::validation::act_info, arm_compute::test::validation::conv_info, CLKernelLibrary::get(), arm_compute::test::validation::input, arm_compute::test::validation::num_groups, arm_compute::test::validation::post_ops, and arm_compute::test::validation::weights_info.

Referenced by CLDirectDeconvolutionLayer::configure().
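To make the quantized rows of the table above concrete, here is a minimal sketch of a QASYMM8 configuration with S32 biases. The shapes, quantisation scales and offsets are illustrative assumptions; the CL scheduler is assumed to have been initialised already.

#include "arm_compute/core/TensorInfo.h"
#include "arm_compute/core/Types.h"
#include "arm_compute/runtime/CL/CLTensor.h"
#include "arm_compute/runtime/CL/functions/CLConvolutionLayer.h"

using namespace arm_compute;

// Hypothetical helper: a 1x1 (pointwise) quantized convolution, 64 -> 128 channels.
void configure_quantized_pointwise(CLConvolutionLayer &conv, CLTensor &src, CLTensor &weights, CLTensor &biases, CLTensor &dst)
{
    src.allocator()->init(TensorInfo(TensorShape(56U, 56U, 64U), 1, DataType::QASYMM8, QuantizationInfo(0.02f, 128)));
    weights.allocator()->init(TensorInfo(TensorShape(1U, 1U, 64U, 128U), 1, DataType::QASYMM8, QuantizationInfo(0.005f, 127)));
    biases.allocator()->init(TensorInfo(TensorShape(128U), 1, DataType::S32)); // S32 biases, per the table
    dst.allocator()->init(TensorInfo(TensorShape(56U, 56U, 128U), 1, DataType::QASYMM8, QuantizationInfo(0.05f, 128)));

    conv.configure(&src, &weights, &biases, &dst, PadStrideInfo(1, 1, 0, 0));
}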

◆ get_convolution_method()

ConvolutionMethod get_convolution_method ( const ITensorInfo * input,
const ITensorInfo * weights,
const ITensorInfo * output,
const PadStrideInfo & conv_info,
const WeightsInfo & weights_info,
const ActivationLayerInfo & act_info,
const GPUTarget  gpu_target,
const Size2D & dilation = Size2D(1U, 1U),
bool  enable_fast_math = false 
)
static

Static function to check which convolution method will be used by CLConvolutionLayer for the given info.

Parameters
  [in]   input              Source tensor. 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32.
  [in]   weights            Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: same as input; can also be QSYMM8_PER_CHANNEL if input is QASYMM8/QASYMM8_SIGNED.
  [in]   output             Destination tensor. 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: same as input.
  [in]   conv_info          Contains padding and stride information described in PadStrideInfo.
  [in]   weights_info       Specifies if the weights tensor has been reshaped with CLWeightsReshapeKernel.
  [in]   act_info           (Optional) Activation layer information in case of a fused activation.
  [in]   gpu_target         Specifies the GPUTarget.
  [in]   dilation           (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
  [in]   enable_fast_math   (Optional) Enable fast math computation. If this flag is set, the function may dispatch the fastest implementation available, which can reduce accuracy. Default is false.
Returns
the Convolution Method Hint

Definition at line 164 of file CLConvolutionLayer.cpp.

ConvolutionMethod CLConvolutionLayer::get_convolution_method(const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *output, const PadStrideInfo &conv_info,
                                                             const WeightsInfo &weights_info, const ActivationLayerInfo &act_info, const GPUTarget gpu_target, const Size2D &dilation, bool enable_fast_math)
{
    const Conv2dInfo conv2d_info = Conv2dInfo(conv_info, dilation, act_info, enable_fast_math, 1);
    return opencl::ClConv2d::get_convolution_method(input, weights, output, conv2d_info, weights_info, gpu_target);
}

References arm_compute::test::validation::act_info, arm_compute::test::validation::conv_info, ClConv2d::get_convolution_method(), arm_compute::test::validation::input, and arm_compute::test::validation::weights_info.
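A hedged sketch of querying the hint before configuring, for example to log which algorithm the layer would pick for the current GPU target. The helper name is illustrative; the tensor infos are assumed to describe the same shapes that would be used at configure time.

#include "arm_compute/runtime/CL/CLScheduler.h"
#include "arm_compute/runtime/CL/functions/CLConvolutionLayer.h"

using namespace arm_compute;

// Returns true when the given configuration would be lowered to a Winograd convolution.
bool would_use_winograd(const ITensorInfo *src, const ITensorInfo *weights, const ITensorInfo *dst,
                        const PadStrideInfo &conv_info)
{
    const ConvolutionMethod method = CLConvolutionLayer::get_convolution_method(
        src, weights, dst, conv_info,
        WeightsInfo(), ActivationLayerInfo(),
        CLScheduler::get().target(), // the GPU target influences the heuristic
        Size2D(1U, 1U),              // dilation
        /* enable_fast_math */ false);
    return method == ConvolutionMethod::WINOGRAD;
}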

◆ operator=() [1/2]

CLConvolutionLayer& operator= ( CLConvolutionLayer &&  )
default

Default move assignment operator.

◆ operator=() [2/2]

CLConvolutionLayer& operator= ( const CLConvolutionLayer & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ prepare()

void prepare ( )
overridevirtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note
The prepare stage might not need all of the function's buffers' backing memory to be available in order to execute.

Reimplemented from IFunction.

Definition at line 187 of file CLConvolutionLayer.cpp.

void CLConvolutionLayer::prepare()
{
    if(_impl->func)
    {
        _impl->func->prepare();
    }
    else
    {
        _impl->op->prepare(_impl->prep_pack);

        // Release temporary tensors that are only used in prepare stage
        release_temporaries(_impl->aux_mem_req, _impl->workspace);
    }
}

References arm_compute::release_temporaries().

Referenced by CLDirectDeconvolutionLayer::prepare(), and CLConvolutionLayer::run().
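run() calls prepare() itself on first execution (see below), so calling it explicitly is optional. A hedged sketch of doing the one-off work up front, for example at model-load time, assuming conv is an already-configured layer whose weight tensors have been filled:

// One-off weight transformations happen here instead of inside the first run().
conv.prepare();

// Per-inference loop: only run() (plus a wait) is needed from now on.
conv.run();
CLScheduler::get().sync();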

◆ run()

void run ( )
overridevirtual

Run the kernels contained in the function.

For CPU kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Will call prepare() on first run if it hasn't been done

Implements IFunction.

Definition at line 171 of file CLConvolutionLayer.cpp.

void CLConvolutionLayer::run()
{
    prepare();

    MemoryGroupResourceScope scope_mg(_impl->memory_group);

    if(_impl->func)
    {
        _impl->func->run();
    }
    else
    {
        _impl->op->run(_impl->run_pack);
    }
}

References CLConvolutionLayer::prepare().

Referenced by CLDirectDeconvolutionLayer::run().
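Because run() only enqueues and flushes, results must be awaited before the host reads them. A hedged sketch, assuming conv is a configured layer and dst is its CLTensor output:

conv.run();                 // enqueues the kernels and flushes the queue; does not block

// Either wait for the whole queue...
CLScheduler::get().sync();

// ...or map the output (blocking by default), read it on the host, then unmap it.
dst.map();
// ... access the data, e.g. via dst.buffer() or a Window/Iterator ...
dst.unmap();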

◆ validate()

Status validate ( const ITensorInfo * input,
const ITensorInfo * weights,
const ITensorInfo * biases,
const ITensorInfo * output,
const PadStrideInfo & conv_info,
const WeightsInfo & weights_info = WeightsInfo(),
const Size2D & dilation = Size2D(1U, 1U),
const ActivationLayerInfo & act_info = ActivationLayerInfo(),
bool  enable_fast_math = false,
unsigned int  num_groups = 1,
const experimental::PostOpList< ITensorInfo * > &  post_ops = experimental::PostOpList<ITensorInfo *> {} 
)
static

Static function to check if given info will lead to a valid configuration of CLConvolutionLayer.

Parameters
  [in]   input              Source tensor. 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32.
  [in]   weights            Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: same as input; can also be QSYMM8_PER_CHANNEL if input is QASYMM8/QASYMM8_SIGNED.
  [in]   biases             Biases tensor. Shared biases are supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: same as input, except for QASYMM8/QASYMM8_SIGNED input, where biases should be of S32 type.
  [in]   output             Destination tensor. 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: same as input.
  [in]   conv_info          Contains padding and stride information described in PadStrideInfo.
  [in]   weights_info       Specifies if the weights tensor has been reshaped with CLWeightsReshapeKernel.
  [in]   dilation           (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
  [in]   act_info           (Optional) Activation layer information in case of a fused activation.
  [in]   enable_fast_math   (Optional) Enable fast math computation. If this flag is set, the function may dispatch the fastest implementation available, which can reduce accuracy. Default is false.
  [in]   num_groups         (Optional) Number of groups when performing a grouped convolution. num_groups != 1 is only supported for the NCHW data layout.
  [in]   post_ops           (Optional) A sequence of post operations that are performed after the main operation.
Returns
a status

Definition at line 129 of file CLConvolutionLayer.cpp.

Status CLConvolutionLayer::validate(const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, const PadStrideInfo &conv_info,
                                    const WeightsInfo &weights_info, const Size2D &dilation, const ActivationLayerInfo &act_info, bool enable_fast_math, unsigned int num_groups,
                                    const experimental::PostOpList<ITensorInfo *> &post_ops)
{
    ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(input, weights, output);
    ARM_COMPUTE_RETURN_ERROR_ON_MSG(!weights->are_values_constant(), "Dynamic weights are not supported");
    ARM_COMPUTE_RETURN_ERROR_ON_MSG((num_groups != 1) && (input->data_layout() != DataLayout::NCHW), "Grouping (num_groups != 1) with NHWC data layout is not supported");

    const GPUTarget  gpu_target  = CLScheduler::get().target();
    const Conv2dInfo conv2d_info = Conv2dInfo(conv_info, dilation, act_info, enable_fast_math, num_groups, post_ops);

    switch(opencl::ClConv2d::get_convolution_method(input, weights, output, conv2d_info, weights_info, gpu_target))
    {
        case ConvolutionMethod::WINOGRAD:
        case ConvolutionMethod::DIRECT:
        case ConvolutionMethod::INDIRECT:
        case ConvolutionMethod::GEMM:
        {
            ARM_COMPUTE_RETURN_ON_ERROR(opencl::ClConv2d::validate(input, weights, biases, output, conv2d_info, weights_info));
            break;
        }
        case ConvolutionMethod::FFT:
        {
            // Validate FFT-based convolution layer
            ARM_COMPUTE_RETURN_ERROR_ON_MSG(post_ops.size() > 0, "CLFFTConvolutionLayer does not support post ops");
            ARM_COMPUTE_RETURN_ON_ERROR(CLFFTConvolutionLayer::validate(input, weights, nullptr, output, conv_info, act_info, enable_fast_math));
            break;
        }
        default:
            ARM_COMPUTE_ERROR("Not supported.");
            break;
    }

    return Status{};
}

References arm_compute::test::validation::act_info, ITensorInfo::are_values_constant(), ARM_COMPUTE_ERROR, ARM_COMPUTE_RETURN_ERROR_ON_MSG, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, arm_compute::test::validation::conv_info, arm_compute::DIRECT, arm_compute::FFT, arm_compute::GEMM, CLScheduler::get(), ClConv2d::get_convolution_method(), arm_compute::INDIRECT, arm_compute::test::validation::input, arm_compute::NCHW, arm_compute::test::validation::num_groups, arm_compute::test::validation::post_ops, CLScheduler::target(), ClConv2d::validate(), CLFFTConvolutionLayer::validate(), arm_compute::test::validation::weights_info, and arm_compute::WINOGRAD.

Referenced by CLConvolutionLayer::configure(), and CLDirectDeconvolutionLayer::validate().
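A hedged sketch of checking a configuration before creating the function. The shapes and the helper name are illustrative assumptions; CLScheduler::get().default_init() is assumed to have been called, since the method-selection heuristic consults the current GPU target.

#include "arm_compute/core/TensorInfo.h"
#include "arm_compute/runtime/CL/functions/CLConvolutionLayer.h"
#include <iostream>

using namespace arm_compute;

// Returns true when an F16 3x3 convolution with these illustrative shapes is supported.
bool is_supported_f16_3x3()
{
    const TensorInfo src_info(TensorShape(224U, 224U, 3U, 1U), 1, DataType::F16);
    const TensorInfo wgt_info(TensorShape(3U, 3U, 3U, 64U), 1, DataType::F16);
    const TensorInfo bias_info(TensorShape(64U), 1, DataType::F16);
    const TensorInfo dst_info(TensorShape(224U, 224U, 64U, 1U), 1, DataType::F16);

    const Status status = CLConvolutionLayer::validate(&src_info, &wgt_info, &bias_info, &dst_info,
                                                       PadStrideInfo(1, 1, 1, 1));
    if(status.error_code() != ErrorCode::OK)
    {
        std::cerr << "Unsupported convolution: " << status.error_description() << std::endl;
        return false;
    }
    return true;
}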


The documentation for this class was generated from the following files:
  • CLConvolutionLayer.h
  • CLConvolutionLayer.cpp