Compute Library 19.11
CLDepthwiseConvolutionLayer Class Reference

Function to execute a depthwise convolution. More...

#include <CLDepthwiseConvolutionLayer.h>

Collaboration diagram for CLDepthwiseConvolutionLayer (diagram not included here).

Public Member Functions

 CLDepthwiseConvolutionLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Default constructor. More...
 
 CLDepthwiseConvolutionLayer (const CLDepthwiseConvolutionLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 CLDepthwiseConvolutionLayer (CLDepthwiseConvolutionLayer &&)=default
 Default move constructor. More...
 
CLDepthwiseConvolutionLayer & operator= (const CLDepthwiseConvolutionLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
CLDepthwiseConvolutionLayer & operator= (CLDepthwiseConvolutionLayer &&)=default
 Default move assignment operator. More...
 
void configure (ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, const PadStrideInfo &conv_info, unsigned int depth_multiplier=1, ActivationLayerInfo act_info=ActivationLayerInfo(), const Size2D &dilation=Size2D(1U, 1U))
 Initialize the function's source, destination, weights and convolution information. More...
 
void run () override
 Run the kernels contained in the function. More...
 
void prepare () override
 Prepare the function for executing. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, const PadStrideInfo &conv_info, unsigned int depth_multiplier=1, ActivationLayerInfo act_info=ActivationLayerInfo(), const Size2D &dilation=Size2D(1U, 1U))
 Static function to check if given info will lead to a valid configuration of CLDepthwiseConvolutionLayer. More...
 

Detailed Description

Function to execute a depthwise convolution.

Definition at line 45 of file CLDepthwiseConvolutionLayer.h.
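
The following is a minimal, hedged usage sketch and is not part of the generated reference. It assumes an FP32 NCHW workload, a 3x3 kernel with stride 1 and padding 1, and illustrative tensor sizes; it shows the usual configure / allocate / run flow for a CL function.

    #include "arm_compute/core/TensorInfo.h"
    #include "arm_compute/core/Types.h"
    #include "arm_compute/runtime/CL/CLFunctions.h"
    #include "arm_compute/runtime/CL/CLScheduler.h"
    #include "arm_compute/runtime/CL/CLTensor.h"

    using namespace arm_compute;

    int main()
    {
        // Create an OpenCL context/queue for all CL functions.
        CLScheduler::get().default_init();

        // Illustrative FP32 NCHW shapes: 32x32 image, 16 channels, 3x3 depthwise kernel.
        const unsigned int W = 32, H = 32, C = 16;

        CLTensor input, weights, biases, output;
        input.allocator()->init(TensorInfo(TensorShape(W, H, C), 1, DataType::F32));
        weights.allocator()->init(TensorInfo(TensorShape(3U, 3U, C), 1, DataType::F32)); // [kernel_x, kernel_y, IFM]
        biases.allocator()->init(TensorInfo(TensorShape(C), 1, DataType::F32));          // [IFM]
        // With a 3x3 kernel, stride 1 and padding 1, the spatial size is preserved.
        output.allocator()->init(TensorInfo(TensorShape(W, H, C), 1, DataType::F32));

        CLDepthwiseConvolutionLayer dwconv;
        dwconv.configure(&input, &weights, &biases, &output, PadStrideInfo(1, 1, 1, 1));

        // Allocate the backing CL buffers after configuration, then fill input/weights/biases (omitted).
        input.allocator()->allocate();
        weights.allocator()->allocate();
        biases.allocator()->allocate();
        output.allocator()->allocate();

        dwconv.run();              // enqueues the kernels
        CLScheduler::get().sync(); // run() does not block; wait before reading results

        return 0;
    }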

Constructor & Destructor Documentation

◆ CLDepthwiseConvolutionLayer() [1/3]

CLDepthwiseConvolutionLayer ( std::shared_ptr< IMemoryManager > memory_manager = nullptr)

Default constructor.

Definition at line 570 of file CLDepthwiseConvolutionLayer.cpp.

570 CLDepthwiseConvolutionLayer::CLDepthwiseConvolutionLayer(std::shared_ptr<IMemoryManager> memory_manager)
571     : _memory_manager(std::move(memory_manager)), _depth_conv_func(DepthwiseConvolutionFunction::GENERIC), _func_3x3(), _func_generic()
572 {
573 }

References arm_compute::GENERIC.
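
The memory_manager argument is optional. The sketch below is illustrative rather than prescribed by this class: it shows one way a shared IMemoryManager could be built with MemoryManagerOnDemand so several functions can reuse intermediate buffers, while passing nullptr (the default) keeps each function's scratch memory private.

    #include "arm_compute/runtime/BlobLifetimeManager.h"
    #include "arm_compute/runtime/CL/CLBufferAllocator.h"
    #include "arm_compute/runtime/CL/CLFunctions.h"
    #include "arm_compute/runtime/MemoryManagerOnDemand.h"
    #include "arm_compute/runtime/PoolManager.h"

    #include <memory>

    using namespace arm_compute;

    void build_functions()
    {
        // Illustrative manager choices; any IMemoryManager implementation can be passed.
        auto lifetime_mgr = std::make_shared<BlobLifetimeManager>();
        auto pool_mgr     = std::make_shared<PoolManager>();
        auto mm           = std::make_shared<MemoryManagerOnDemand>(lifetime_mgr, pool_mgr);

        // Functions constructed with the same manager can share their intermediate buffers.
        CLDepthwiseConvolutionLayer dwconv0(mm);
        CLDepthwiseConvolutionLayer dwconv1(mm);

        // Default construction (memory_manager = nullptr): each function owns its scratch memory.
        CLDepthwiseConvolutionLayer dwconv_standalone;

        // ... configure the managed functions and allocate their tensors, then fill the pools once,
        //     e.g. with: CLBufferAllocator allocator; mm->populate(allocator, 1 /* num_pools */); ...
    }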

◆ CLDepthwiseConvolutionLayer() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ CLDepthwiseConvolutionLayer() [3/3]

Default move constructor.

Member Function Documentation

◆ configure()

void configure ( ICLTensor * input,
const ICLTensor * weights,
const ICLTensor * biases,
ICLTensor * output,
const PadStrideInfo & conv_info,
unsigned int  depth_multiplier = 1,
ActivationLayerInfo  act_info = ActivationLayerInfo(),
const Size2D & dilation = Size2D(1U, 1U) 
)

Initialize the function's source, destination, weights and convolution information.

Parameters
    [in,out]  input             Source tensor. Data type supported: QASYMM8/FP16/FP32. Data layout supported: NHWC, NCHW
    [in]      weights           Weights tensor. These are 3D tensors with shape [kernel_x, kernel_y, IFM]. Data type supported: Same as input or QASYMM8/QSYMM8_PER_CHANNEL when input is QASYMM8.
    [in]      biases            Biases tensor. A 1D tensor with shape [IFM]. Must be nullptr if not needed. Data type supported: Same as input, S32 when input is QASYMM8.
    [out]     output            Destination tensor. Data type supported: same as input.
    [in]      conv_info         Padding and stride information to use for the convolution.
    [in]      depth_multiplier  (Optional) Multiplier to apply to the input's depth in order to retrieve the output's depth. Defaults to 1.
    [in]      act_info          (Optional) Activation layer information in case of a fused activation.
    [in]      dilation          (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
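
As a hedged illustration of the quantized data-type combination listed above (QASYMM8 input/weights/output with S32 biases), the fragment below configures the layer with made-up shapes and quantization parameters; the RELU activation and unit dilation are likewise only examples.

    #include "arm_compute/core/TensorInfo.h"
    #include "arm_compute/core/Types.h"
    #include "arm_compute/runtime/CL/CLFunctions.h"
    #include "arm_compute/runtime/CL/CLTensor.h"

    using namespace arm_compute;

    void configure_quantized_example()
    {
        // Illustrative QASYMM8 setup: 64x64 image with 8 channels (NCHW), 3x3 kernel.
        CLTensor input, weights, biases, output;
        const QuantizationInfo qinfo(0.0039f, 128); // made-up scale/offset

        input.allocator()->init(TensorInfo(TensorShape(64U, 64U, 8U), 1, DataType::QASYMM8, qinfo));
        weights.allocator()->init(TensorInfo(TensorShape(3U, 3U, 8U), 1, DataType::QASYMM8, qinfo));  // [kernel_x, kernel_y, IFM]
        biases.allocator()->init(TensorInfo(TensorShape(8U), 1, DataType::S32));                      // S32 biases when input is QASYMM8
        output.allocator()->init(TensorInfo(TensorShape(64U, 64U, 8U), 1, DataType::QASYMM8, qinfo)); // depth_multiplier = 1, so OFM == IFM

        CLDepthwiseConvolutionLayer dwconv;
        dwconv.configure(&input, &weights, &biases, &output,
                         PadStrideInfo(1, 1, 1, 1), // stride 1, pad 1: spatial size preserved for a 3x3 kernel
                         1,                         // depth_multiplier: output depth = input depth * 1
                         ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::RELU),
                         Size2D(1U, 1U));           // no dilation

        // Allocation, filling and run()/sync() follow the same pattern as in the sketch
        // under "Detailed Description".
    }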

Definition at line 575 of file CLDepthwiseConvolutionLayer.cpp.

575 void CLDepthwiseConvolutionLayer::configure(ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, const PadStrideInfo &conv_info,
576                                             unsigned int depth_multiplier, ActivationLayerInfo act_info, const Size2D &dilation)
577 {
578     const GPUTarget gpu_target = CLScheduler::get().target();
579     _depth_conv_func = get_depthwiseconvolution_function(input->info(), weights->info(), (biases != nullptr) ? biases->info() : nullptr, output->info(), conv_info, depth_multiplier, act_info,
580                                                          dilation, gpu_target);
581     switch(_depth_conv_func)
582     {
583         case DepthwiseConvolutionFunction::OPTIMIZED:
584             _func_3x3.set_memory_group(_memory_manager);
585             _func_3x3.configure(input, weights, biases, output, conv_info, depth_multiplier, act_info, dilation);
586             break;
587         case DepthwiseConvolutionFunction::GENERIC:
588         {
589             _func_generic.set_memory_group(_memory_manager);
590             _func_generic.configure(input, weights, biases, output, conv_info, depth_multiplier, act_info, dilation);
591         }
592         break;
593         default:
594             ARM_COMPUTE_ERROR("Unsupported DepthwiseConvolutionFunction");
595     }
596 }

References arm_compute::test::validation::act_info, ARM_COMPUTE_ERROR, arm_compute::test::validation::conv_info, arm_compute::test::validation::dilation, arm_compute::GENERIC, CLScheduler::get(), ITensor::info(), CLTensor::info(), arm_compute::test::validation::input, arm_compute::OPTIMIZED, CLScheduler::target(), and arm_compute::test::validation::weights.

Referenced by CLDepthwiseConvolutionLayer3x3::configure().

◆ operator=() [1/2]

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

Default move assignment operator.

◆ prepare()

void prepare ( )
override virtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note
Prepare stage might not need all the function's buffers' backing memory to be available in order to execute
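
A short, hedged sketch of the intended call order: prepare() is invoked once before a loop of run() calls so the one-off work does not land inside the first iteration. The helper name and loop are illustrative.

    #include "arm_compute/runtime/CL/CLFunctions.h"
    #include "arm_compute/runtime/CL/CLScheduler.h"

    using namespace arm_compute;

    // Illustrative helper: run an already-configured layer over many inputs,
    // doing the one-off preparation exactly once up front.
    void run_many(CLDepthwiseConvolutionLayer &dwconv, int num_frames)
    {
        dwconv.prepare(); // optional here: run() would trigger it on first use anyway

        for(int i = 0; i < num_frames; ++i)
        {
            // ... map the input tensor, write frame i, unmap (omitted) ...
            dwconv.run();
            CLScheduler::get().sync(); // wait before reading the output back
        }
    }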

Reimplemented from IFunction.

Definition at line 644 of file CLDepthwiseConvolutionLayer.cpp.

644 void CLDepthwiseConvolutionLayer::prepare()
645 {
646     switch(_depth_conv_func)
647     {
648         case DepthwiseConvolutionFunction::OPTIMIZED:
649             _func_3x3.prepare();
650             break;
651         case DepthwiseConvolutionFunction::GENERIC:
652             _func_generic.prepare();
653             break;
654         default:
655             ARM_COMPUTE_ERROR("DepthwiseConvolutionFunction not properly configured");
656     }
657 }

References ARM_COMPUTE_ERROR, arm_compute::GENERIC, and arm_compute::OPTIMIZED.

Referenced by CLDepthwiseConvolutionLayer3x3::prepare().

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For NEON kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Will call prepare() on the first run if it has not already been done.
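
Because run() only enqueues and flushes, host code must wait explicitly before touching the results. A hedged sketch (the function and tensor names are illustrative, and an FP32 output is assumed):

    #include "arm_compute/runtime/CL/CLFunctions.h"
    #include "arm_compute/runtime/CL/CLScheduler.h"
    #include "arm_compute/runtime/CL/CLTensor.h"

    using namespace arm_compute;

    // Illustrative helper: execute a configured layer and read an FP32 result on the host.
    void run_and_read(CLDepthwiseConvolutionLayer &dwconv, CLTensor &output)
    {
        dwconv.run();              // enqueues the kernels on the CLScheduler queue and flushes it
        CLScheduler::get().sync(); // run() does not block, so wait explicitly

        output.map(true);          // blocking map of the CL buffer into host-visible memory
        const auto *out_ptr = reinterpret_cast<const float *>(output.buffer());
        (void)out_ptr;             // ... consume the results here ...
        output.unmap();
    }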

Implements IFunction.

Definition at line 629 of file CLDepthwiseConvolutionLayer.cpp.

629 void CLDepthwiseConvolutionLayer::run()
630 {
631     switch(_depth_conv_func)
632     {
633         case DepthwiseConvolutionFunction::OPTIMIZED:
634             _func_3x3.run();
635             break;
636         case DepthwiseConvolutionFunction::GENERIC:
637             _func_generic.run();
638             break;
639         default:
640             ARM_COMPUTE_ERROR("DepthwiseConvolutionFunction not properly configured");
641     }
642 }

References ARM_COMPUTE_ERROR, arm_compute::GENERIC, and arm_compute::OPTIMIZED.

Referenced by CLDepthwiseConvolutionLayer3x3::run().

◆ validate()

Status validate ( const ITensorInfo * input,
const ITensorInfo * weights,
const ITensorInfo * biases,
const ITensorInfo * output,
const PadStrideInfo & conv_info,
unsigned int  depth_multiplier = 1,
ActivationLayerInfo  act_info = ActivationLayerInfo(),
const Size2D & dilation = Size2D(1U, 1U) 
)
static

Static function to check if given info will lead to a valid configuration of CLDepthwiseConvolutionLayer.

Parameters
    [in]  input             Source tensor info. Data type supported: QASYMM8/FP16/FP32. Data layout supported: NHWC, NCHW
    [in]  weights           Weights tensor info. These are 3D tensors with shape [kernel_x, kernel_y, IFM]. Data type supported: Same as input or QASYMM8/QSYMM8_PER_CHANNEL when input is QASYMM8.
    [in]  biases            Biases tensor info. A 1D tensor with shape [IFM]. Must be nullptr if not needed. Data type supported: Same as input, S32 when input is QASYMM8.
    [in]  output            Destination tensor info. Data type supported: same as input.
    [in]  conv_info         Padding and stride information to use for the convolution.
    [in]  depth_multiplier  (Optional) Multiplier to apply to the input's depth in order to retrieve the output's depth. Defaults to 1.
    [in]  act_info          (Optional) Activation layer information in case of a fused activation. Only RELU, BOUNDED_RELU and LU_BOUNDED_RELU are supported for 3x3 QASYMM8.
    [in]  dilation          (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
Returns
a status
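
A hedged sketch of how validate() is typically used to probe support from tensor metadata alone, before any CLTensor is allocated; the wrapper function below is illustrative.

    #include "arm_compute/core/Error.h"
    #include "arm_compute/runtime/CL/CLFunctions.h"

    using namespace arm_compute;

    // Illustrative helper: check whether a configuration would be accepted.
    bool depthwise_supported(const ITensorInfo &input, const ITensorInfo &weights, const ITensorInfo *biases,
                             const ITensorInfo &output, const PadStrideInfo &conv_info)
    {
        const Status status = CLDepthwiseConvolutionLayer::validate(&input, &weights, biases, &output, conv_info);
        if(status.error_code() != ErrorCode::OK)
        {
            // status.error_description() explains why the configuration was rejected.
            return false;
        }
        return true;
    }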

Definition at line 598 of file CLDepthwiseConvolutionLayer.cpp.

598 Status CLDepthwiseConvolutionLayer::validate(const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, const PadStrideInfo &conv_info,
599                                              unsigned int depth_multiplier, ActivationLayerInfo act_info, const Size2D &dilation)
600 {
601     const GPUTarget gpu_target = CLScheduler::get().target();
602     DepthwiseConvolutionFunction depth_conv_func = get_depthwiseconvolution_function(input, weights, biases, output, conv_info, depth_multiplier, act_info, dilation, gpu_target);
603     switch(depth_conv_func)
604     {
605         case DepthwiseConvolutionFunction::OPTIMIZED:
606             return CLDepthwiseConvolutionLayerInternal3x3::validate(input, weights, biases, output, conv_info, depth_multiplier, act_info, gpu_target, dilation);
607         case DepthwiseConvolutionFunction::GENERIC:
608             return CLDepthwiseConvolutionLayerGeneric::validate(input, weights, biases, output, conv_info, depth_multiplier, act_info, dilation);
609         default:
610             ARM_COMPUTE_ERROR("Unsupported DepthwiseConvolutionFunction");
611     }
612 }

References arm_compute::test::validation::act_info, ARM_COMPUTE_ERROR, arm_compute::test::validation::conv_info, arm_compute::test::validation::dilation, arm_compute::GENERIC, CLScheduler::get(), arm_compute::test::validation::input, arm_compute::OPTIMIZED, CLScheduler::target(), arm_compute::validate(), and arm_compute::test::validation::weights.


The documentation for this class was generated from the following files:

CLDepthwiseConvolutionLayer.h
CLDepthwiseConvolutionLayer.cpp