Compute Library
 21.02
CLDepthwiseConvolutionLayer Class Reference

Function to execute a depthwise convolution.

#include <CLDepthwiseConvolutionLayer.h>

Public Member Functions

 CLDepthwiseConvolutionLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Default constructor.
 
 CLDepthwiseConvolutionLayer (const CLDepthwiseConvolutionLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers)
 
 CLDepthwiseConvolutionLayer (CLDepthwiseConvolutionLayer &&)=default
 Default move constructor.
 
CLDepthwiseConvolutionLayer & operator= (const CLDepthwiseConvolutionLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers)
 
CLDepthwiseConvolutionLayer & operator= (CLDepthwiseConvolutionLayer &&)=default
 Default move assignment operator.
 
 ~CLDepthwiseConvolutionLayer ()
 Default destructor.
 
void configure (ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, const PadStrideInfo &conv_info, unsigned int depth_multiplier=1, ActivationLayerInfo act_info=ActivationLayerInfo(), const Size2D &dilation=Size2D(1U, 1U))
 Initialize the function's source, destination, weights and convolution information.
 
void configure (const CLCompileContext &compile_context, ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, const PadStrideInfo &conv_info, unsigned int depth_multiplier=1, ActivationLayerInfo act_info=ActivationLayerInfo(), const Size2D &dilation=Size2D(1U, 1U))
 Initialize the function's source, destination, weights and convolution information.
 
void run () override
 Run the kernels contained in the function.
 
void prepare () override
 Prepare the function for executing.
 
Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor.
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, const PadStrideInfo &conv_info, unsigned int depth_multiplier=1, ActivationLayerInfo act_info=ActivationLayerInfo(), const Size2D &dilation=Size2D(1U, 1U))
 Static function to check if given info will lead to a valid configuration of CLDepthwiseConvolutionLayer.
 

Detailed Description

Function to execute a depthwise convolution.

Definition at line 44 of file CLDepthwiseConvolutionLayer.h.

Constructor & Destructor Documentation

◆ CLDepthwiseConvolutionLayer() [1/3]

CLDepthwiseConvolutionLayer ( std::shared_ptr< IMemoryManager > memory_manager = nullptr )

Default constructor.

Definition at line 564 of file CLDepthwiseConvolutionLayer.cpp.

CLDepthwiseConvolutionLayer::CLDepthwiseConvolutionLayer(std::shared_ptr<IMemoryManager> memory_manager)
    : _memory_manager(std::move(memory_manager)), _depth_conv_func(DepthwiseConvolutionFunction::GENERIC), _func_3x3(), _func_generic()
{
}

◆ CLDepthwiseConvolutionLayer() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ CLDepthwiseConvolutionLayer() [3/3]

Default move constructor.

◆ ~CLDepthwiseConvolutionLayer()

Default destructor.

Member Function Documentation

◆ configure() [1/2]

void configure ( ICLTensor * input,
const ICLTensor * weights,
const ICLTensor * biases,
ICLTensor * output,
const PadStrideInfo & conv_info,
unsigned int depth_multiplier = 1,
ActivationLayerInfo act_info = ActivationLayerInfo(),
const Size2D & dilation = Size2D(1U, 1U)
)

Initialize the function's source, destination, weights and convolution information.

Parameters
[in,out]  input  Source tensor. Data type supported: QASYMM8/QASYMM8_SIGNED/FP16/FP32. Data layout supported: NHWC, NCHW.
[in]  weights  Weights tensor. These are 3D tensors with shape [kernel_x, kernel_y, IFM]. Data type supported: Same as input or QASYMM8/QASYMM8_SIGNED/QSYMM8_PER_CHANNEL when input is QASYMM8.
[in]  biases  Biases tensor. A 1D tensor with shape [IFM]. Must be nullptr if not needed. Data type supported: Same as input, S32 when input is QASYMM8/QASYMM8_SIGNED.
[out]  output  Destination tensor. Data type supported: same as input.
[in]  conv_info  Padding and stride information to use for the convolution.
[in]  depth_multiplier  (Optional) Multiplier to apply to the input's depth in order to retrieve the output's depth. Defaults to 1.
[in]  act_info  (Optional) Activation layer information in case of a fused activation.
[in]  dilation  (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
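
The way stride, padding, dilation and depth_multiplier determine the destination shape can be sketched as follows. This is a minimal, library-independent illustration that assumes floor rounding; Shape3D, ConvParams, conv_out_dim and depthwise_output_shape are hypothetical names, not Compute Library types or functions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper types for illustration only (not Compute Library classes).
struct Shape3D
{
    std::size_t width;
    std::size_t height;
    std::size_t channels;
};

struct ConvParams
{
    std::size_t stride_x, stride_y;
    std::size_t pad_left, pad_right, pad_top, pad_bottom;
    std::size_t dilation_x, dilation_y;
    std::size_t depth_multiplier;
};

// Extent of one spatial dimension of the output, assuming floor rounding.
inline std::size_t conv_out_dim(std::size_t in, std::size_t kernel, std::size_t pad_before,
                                std::size_t pad_after, std::size_t stride, std::size_t dilation)
{
    assert(stride > 0 && dilation > 0 && kernel > 0);
    // A dilated kernel covers (kernel - 1) * dilation + 1 input elements.
    const std::size_t effective_kernel = (kernel - 1) * dilation + 1;
    assert(in + pad_before + pad_after >= effective_kernel);
    return (in + pad_before + pad_after - effective_kernel) / stride + 1;
}

// Output depth is the input depth multiplied by depth_multiplier.
inline Shape3D depthwise_output_shape(const Shape3D &input, std::size_t kernel_x, std::size_t kernel_y,
                                      const ConvParams &p)
{
    return Shape3D{
        conv_out_dim(input.width, kernel_x, p.pad_left, p.pad_right, p.stride_x, p.dilation_x),
        conv_out_dim(input.height, kernel_y, p.pad_top, p.pad_bottom, p.stride_y, p.dilation_y),
        input.channels * p.depth_multiplier};
}
```

For example, a 32x32x16 input with a 3x3 kernel, stride 1 and padding 1 on every side keeps its 32x32x16 shape, while stride 2, no padding, dilation 2 and depth_multiplier 2 gives 14x14x32 under these assumptions.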

Definition at line 569 of file CLDepthwiseConvolutionLayer.cpp.

References CLKernelLibrary::get().

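void CLDepthwiseConvolutionLayer::configure(ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, const PadStrideInfo &conv_info, unsigned int depth_multiplier, ActivationLayerInfo act_info, const Size2D &dilation)
{
    configure(CLKernelLibrary::get().get_compile_context(), input, weights, biases, output, conv_info, depth_multiplier, act_info, dilation);
}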

◆ configure() [2/2]

void configure ( const CLCompileContext & compile_context,
ICLTensor * input,
const ICLTensor * weights,
const ICLTensor * biases,
ICLTensor * output,
const PadStrideInfo & conv_info,
unsigned int depth_multiplier = 1,
ActivationLayerInfo act_info = ActivationLayerInfo(),
const Size2D & dilation = Size2D(1U, 1U)
)

Initialize the function's source, destination, weights and convolution information.

Parameters
[in]  compile_context  The compile context to be used.
[in,out]  input  Source tensor. Data type supported: QASYMM8/QASYMM8_SIGNED/FP16/FP32. Data layout supported: NHWC, NCHW.
[in]  weights  Weights tensor. These are 3D tensors with shape [kernel_x, kernel_y, IFM]. Data type supported: Same as input or QASYMM8/QASYMM8_SIGNED/QSYMM8_PER_CHANNEL when input is QASYMM8.
[in]  biases  Biases tensor. A 1D tensor with shape [IFM]. Must be nullptr if not needed. Data type supported: Same as input, S32 when input is QASYMM8/QASYMM8_SIGNED.
[out]  output  Destination tensor. Data type supported: same as input.
[in]  conv_info  Padding and stride information to use for the convolution.
[in]  depth_multiplier  (Optional) Multiplier to apply to the input's depth in order to retrieve the output's depth. Defaults to 1.
[in]  act_info  (Optional) Activation layer information in case of a fused activation.
[in]  dilation  (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
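
To make the roles of weights, biases, depth_multiplier and dilation concrete, here is a naive CPU reference sketch of a depthwise convolution. It is not the library's OpenCL implementation. It assumes stride 1, no padding, float data, channel-major (CHW) storage, one filter and one bias per output channel, and that output channel c * depth_multiplier + m reads only input channel c. The function name depthwise_conv_reference is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive depthwise convolution (illustrative only): stride 1, no padding, CHW storage.
// weights holds one kernel_h x kernel_w filter per output channel.
// biases is either empty (no bias) or holds one value per output channel.
std::vector<float> depthwise_conv_reference(const std::vector<float> &input,
                                            std::size_t channels, std::size_t height, std::size_t width,
                                            const std::vector<float> &weights,
                                            std::size_t kernel_h, std::size_t kernel_w,
                                            const std::vector<float> &biases,
                                            std::size_t depth_multiplier = 1,
                                            std::size_t dilation_x = 1, std::size_t dilation_y = 1)
{
    const std::size_t out_channels = channels * depth_multiplier;
    const std::size_t eff_kh       = (kernel_h - 1) * dilation_y + 1; // rows spanned by the dilated kernel
    const std::size_t eff_kw       = (kernel_w - 1) * dilation_x + 1; // columns spanned by the dilated kernel
    assert(input.size() == channels * height * width);
    assert(weights.size() == out_channels * kernel_h * kernel_w);
    assert(biases.empty() || biases.size() == out_channels);
    assert(height >= eff_kh && width >= eff_kw);

    const std::size_t out_h = height - eff_kh + 1;
    const std::size_t out_w = width - eff_kw + 1;
    std::vector<float> output(out_channels * out_h * out_w, 0.0f);

    for(std::size_t c = 0; c < channels; ++c)
    {
        for(std::size_t m = 0; m < depth_multiplier; ++m)
        {
            const std::size_t oc = c * depth_multiplier + m; // assumed channel ordering
            for(std::size_t y = 0; y < out_h; ++y)
            {
                for(std::size_t x = 0; x < out_w; ++x)
                {
                    float acc = biases.empty() ? 0.0f : biases[oc];
                    for(std::size_t ky = 0; ky < kernel_h; ++ky)
                    {
                        for(std::size_t kx = 0; kx < kernel_w; ++kx)
                        {
                            const std::size_t in_y = y + ky * dilation_y;
                            const std::size_t in_x = x + kx * dilation_x;
                            acc += input[(c * height + in_y) * width + in_x] * weights[(oc * kernel_h + ky) * kernel_w + kx];
                        }
                    }
                    output[(oc * out_h + y) * out_w + x] = acc;
                }
            }
        }
    }
    return output;
}
```

Unlike a standard convolution, no sum is taken across input channels: each output channel depends on exactly one input channel.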

Definition at line 575 of file CLDepthwiseConvolutionLayer.cpp.

References ARM_COMPUTE_ERROR, arm_compute::test::validation::conv_info, arm_compute::GENERIC, CLScheduler::get(), ITensor::info(), arm_compute::OPTIMIZED, and CLScheduler::target().

void CLDepthwiseConvolutionLayer::configure(const CLCompileContext &compile_context, ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output,
                                            const PadStrideInfo &conv_info, unsigned int depth_multiplier, ActivationLayerInfo act_info, const Size2D &dilation)
{
    const GPUTarget gpu_target = CLScheduler::get().target();
    _depth_conv_func = get_depthwiseconvolution_function(input->info(), weights->info(), (biases != nullptr) ? biases->info() : nullptr, output->info(), conv_info, depth_multiplier, act_info,
                                                         dilation, gpu_target);
    switch(_depth_conv_func)
    {
        case DepthwiseConvolutionFunction::OPTIMIZED:
            _func_3x3.set_memory_group(_memory_manager);
            _func_3x3.configure(compile_context, input, weights, biases, output, conv_info, depth_multiplier, act_info, dilation);
            break;
        case DepthwiseConvolutionFunction::GENERIC:
        {
            _func_generic.set_memory_group(_memory_manager);
            _func_generic.configure(compile_context, input, weights, biases, output, conv_info, depth_multiplier, act_info, dilation);
        }
        break;
        default:
            ARM_COMPUTE_ERROR("Unsupported DepthwiseConvolutionFunction");
    }
}
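
The listing above picks one of two internal implementations at configure time and remembers the choice for later calls. A stripped-down sketch of that dispatch pattern, independent of the library, might look like this. The selection rule (3x3 kernels take the optimized path) is only a placeholder assumption and does not reproduce the heuristics of get_depthwiseconvolution_function; ConvPath, KernelSize and DepthwiseDispatcher are hypothetical names.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the two code paths.
enum class ConvPath
{
    Optimized3x3,
    Generic
};

struct KernelSize
{
    unsigned int width;
    unsigned int height;
};

class DepthwiseDispatcher
{
public:
    // Choose the path once, at configuration time (placeholder rule, not the library's).
    void configure(const KernelSize &k)
    {
        _path       = (k.width == 3 && k.height == 3) ? ConvPath::Optimized3x3 : ConvPath::Generic;
        _configured = true;
    }

    // Forward to the path chosen in configure(); returns a label instead of running kernels.
    std::string run() const
    {
        if(!_configured)
        {
            throw std::runtime_error("function not properly configured");
        }
        switch(_path)
        {
            case ConvPath::Optimized3x3:
                return "optimized 3x3";
            case ConvPath::Generic:
                return "generic";
        }
        throw std::runtime_error("unsupported path");
    }

private:
    ConvPath _path{ ConvPath::Generic };
    bool     _configured{ false };
};
```

Storing the decision in a member keeps run() and prepare() free of any selection logic; they only switch on the stored value.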

◆ operator=() [1/2]

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

Default move assignment operator.
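
The combination of deleted copy operations and defaulted move operations makes the class move-only. A minimal sketch of the same pattern on a hypothetical class (MoveOnlyFunction, with std::shared_ptr<int> standing in for the memory manager) shows what the compiler then allows:

```cpp
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical move-only class mirroring the special member functions listed above.
class MoveOnlyFunction
{
public:
    explicit MoveOnlyFunction(std::shared_ptr<int> memory_manager = nullptr)
        : _memory_manager(std::move(memory_manager))
    {
    }
    MoveOnlyFunction(const MoveOnlyFunction &) = delete;            // no copy construction
    MoveOnlyFunction(MoveOnlyFunction &&)      = default;           // move construction allowed
    MoveOnlyFunction &operator=(const MoveOnlyFunction &) = delete; // no copy assignment
    MoveOnlyFunction &operator=(MoveOnlyFunction &&) = default;     // move assignment allowed
    ~MoveOnlyFunction()                              = default;

    bool has_memory_manager() const
    {
        return _memory_manager != nullptr;
    }

private:
    std::shared_ptr<int> _memory_manager; // stand-in for std::shared_ptr<IMemoryManager>
};

// Compile-time checks of the resulting copy/move properties.
static_assert(!std::is_copy_constructible<MoveOnlyFunction>::value, "copying is disabled");
static_assert(!std::is_copy_assignable<MoveOnlyFunction>::value, "copy assignment is disabled");
static_assert(std::is_move_constructible<MoveOnlyFunction>::value, "moving is enabled");
static_assert(std::is_move_assignable<MoveOnlyFunction>::value, "move assignment is enabled");
```

In practice this means an instance can be stored in a container or returned from a factory by moving it, but it cannot be duplicated.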

◆ prepare()

void prepare ( )
override virtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note
Prepare stage might not need all the function's buffers' backing memory to be available in order to execute

Reimplemented from IFunction.

Definition at line 646 of file CLDepthwiseConvolutionLayer.cpp.

References ARM_COMPUTE_ERROR, arm_compute::GENERIC, and arm_compute::OPTIMIZED.

void CLDepthwiseConvolutionLayer::prepare()
{
    switch(_depth_conv_func)
    {
        case DepthwiseConvolutionFunction::OPTIMIZED:
            _func_3x3.prepare();
            break;
        case DepthwiseConvolutionFunction::GENERIC:
            _func_generic.prepare();
            break;
        default:
            ARM_COMPUTE_ERROR("DepthwiseConvolutionFunction not properly configured");
    }
}

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For Neon kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Calls prepare() on the first run if it has not already been done.
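
The "prepare on first run" behaviour described in the note can be sketched with a simple flag. This is an illustrative pattern, not the library's code; LazyPreparedFunction and its counters are hypothetical and exist only to make the behaviour observable.

```cpp
// Hypothetical class showing one-off preparation triggered lazily by run().
class LazyPreparedFunction
{
public:
    // One-off pre-processing; repeated calls are no-ops.
    void prepare()
    {
        if(!_is_prepared)
        {
            ++_prepare_count; // stand-in for one-off work such as transforming weights
            _is_prepared = true;
        }
    }

    // Every run first makes sure the one-off preparation has happened.
    void run()
    {
        prepare();
        ++_run_count; // stand-in for enqueuing the kernels
    }

    int prepare_count() const
    {
        return _prepare_count;
    }
    int run_count() const
    {
        return _run_count;
    }

private:
    bool _is_prepared{ false };
    int  _prepare_count{ 0 };
    int  _run_count{ 0 };
};
```

Calling prepare() explicitly before the first run() moves the one-off cost out of the first inference without changing the result.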

Implements IFunction.

Definition at line 631 of file CLDepthwiseConvolutionLayer.cpp.

References ARM_COMPUTE_ERROR, arm_compute::GENERIC, and arm_compute::OPTIMIZED.

void CLDepthwiseConvolutionLayer::run()
{
    switch(_depth_conv_func)
    {
        case DepthwiseConvolutionFunction::OPTIMIZED:
            _func_3x3.run();
            break;
        case DepthwiseConvolutionFunction::GENERIC:
            _func_generic.run();
            break;
        default:
            ARM_COMPUTE_ERROR("DepthwiseConvolutionFunction not properly configured");
    }
}

◆ validate()

Status validate ( const ITensorInfo * input,
const ITensorInfo * weights,
const ITensorInfo * biases,
const ITensorInfo * output,
const PadStrideInfo & conv_info,
unsigned int depth_multiplier = 1,
ActivationLayerInfo act_info = ActivationLayerInfo(),
const Size2D & dilation = Size2D(1U, 1U)
)
static

Static function to check if given info will lead to a valid configuration of CLDepthwiseConvolutionLayer.

Parameters
[in]  input  Source tensor info. Data type supported: QASYMM8/QASYMM8_SIGNED/FP16/FP32. Data layout supported: NHWC, NCHW.
[in]  weights  Weights tensor info. These are 3D tensors with shape [kernel_x, kernel_y, IFM]. Data type supported: Same as input or QASYMM8/QASYMM8_SIGNED/QSYMM8_PER_CHANNEL when input is QASYMM8.
[in]  biases  Biases tensor info. A 1D tensor with shape [IFM]. Must be nullptr if not needed. Data type supported: Same as input, S32 when input is QASYMM8/QASYMM8_SIGNED.
[in]  output  Destination tensor. Data type supported: same as input.
[in]  conv_info  Padding and stride information to use for the convolution.
[in]  depth_multiplier  (Optional) Multiplier to apply to the input's depth in order to retrieve the output's depth. Defaults to 1.
[in]  act_info  (Optional) Activation layer information in case of a fused activation. Only RELU, BOUNDED_RELU and LU_BOUNDED_RELU for 3x3 QASYMM8 supported.
[in]  dilation  (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
Returns
a status
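
The data type rules in the parameter list can be expressed as a small stand-alone check. The sketch below encodes only what the parameter descriptions state, read literally; it is not the library's validate() and it ignores shapes, layouts and activation constraints. DataType here is a local enum, and CheckResult, is_quantized_asymmetric and validate_data_types are hypothetical names.

```cpp
#include <string>

// Local stand-in for the data types named in the parameter list.
enum class DataType
{
    QASYMM8,
    QASYMM8_SIGNED,
    QSYMM8_PER_CHANNEL,
    S32,
    F16,
    F32
};

// Hypothetical result type: ok plus a human-readable reason on failure.
struct CheckResult
{
    bool        ok;
    std::string description;
};

inline bool is_quantized_asymmetric(DataType t)
{
    return t == DataType::QASYMM8 || t == DataType::QASYMM8_SIGNED;
}

// biases == nullptr models "no biases", mirroring the nullptr convention of the API.
inline CheckResult validate_data_types(DataType input, DataType weights, const DataType *biases, DataType output)
{
    const bool input_supported = is_quantized_asymmetric(input) || input == DataType::F16 || input == DataType::F32;
    if(!input_supported)
    {
        return { false, "unsupported input data type" };
    }

    // Weights: same as input, or one of the listed quantized types when input is QASYMM8.
    const bool weights_ok = (weights == input)
                            || (input == DataType::QASYMM8
                                && (weights == DataType::QASYMM8_SIGNED || weights == DataType::QSYMM8_PER_CHANNEL));
    if(!weights_ok)
    {
        return { false, "weights data type does not match input" };
    }

    // Biases: S32 for quantized asymmetric inputs, otherwise same as input.
    if(biases != nullptr)
    {
        const DataType expected = is_quantized_asymmetric(input) ? DataType::S32 : input;
        if(*biases != expected)
        {
            return { false, "biases data type does not match input" };
        }
    }

    // Output: same as input.
    if(output != input)
    {
        return { false, "output data type must match input" };
    }
    return { true, "" };
}
```

Running such a check before configure() lets a caller reject an unsupported combination early instead of hitting an error during configuration.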

Definition at line 600 of file CLDepthwiseConvolutionLayer.cpp.

References ARM_COMPUTE_ERROR, ITensorInfo::data_type(), arm_compute::GENERIC, CLScheduler::get(), arm_compute::get_arch_from_target(), arm_compute::is_data_type_float(), arm_compute::MIDGARD, arm_compute::OPTIMIZED, CLScheduler::target(), and arm_compute::validate().

Referenced by arm_compute::test::validation::DATA_TEST_CASE().

Status CLDepthwiseConvolutionLayer::validate(const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, const PadStrideInfo &conv_info,
                                             unsigned int depth_multiplier, ActivationLayerInfo act_info, const Size2D &dilation)
{
    const GPUTarget gpu_target = CLScheduler::get().target();
    DepthwiseConvolutionFunction depth_conv_func = get_depthwiseconvolution_function(input, weights, biases, output, conv_info, depth_multiplier, act_info, dilation, gpu_target);
    switch(depth_conv_func)
    {
        case DepthwiseConvolutionFunction::OPTIMIZED:
            return CLDepthwiseConvolutionLayerInternal3x3::validate(input, weights, biases, output, conv_info, depth_multiplier, act_info, gpu_target, dilation);
        case DepthwiseConvolutionFunction::GENERIC:
            return CLDepthwiseConvolutionLayerGeneric::validate(input, weights, biases, output, conv_info, depth_multiplier, act_info, dilation);
        default:
            ARM_COMPUTE_ERROR("Unsupported DepthwiseConvolutionFunction");
    }
}

The documentation for this class was generated from the following files: