Compute Library
 19.08
NEDepthwiseConvolutionLayer3x3Kernel Class Reference

Interface for the kernel to run a 3x3 depthwise convolution on a tensor.

#include <NEDepthwiseConvolutionLayer3x3Kernel.h>

Public Member Functions

const char * name () const override
 Name of the kernel.
 
 NEDepthwiseConvolutionLayer3x3Kernel ()
 Default constructor.
 
 NEDepthwiseConvolutionLayer3x3Kernel (const NEDepthwiseConvolutionLayer3x3Kernel &)=delete
 Prevent instances of this class from being copied (As this class contains pointers)
 
NEDepthwiseConvolutionLayer3x3Kernel & operator= (const NEDepthwiseConvolutionLayer3x3Kernel &)=delete
 Prevent instances of this class from being copied (As this class contains pointers)
 
 NEDepthwiseConvolutionLayer3x3Kernel (NEDepthwiseConvolutionLayer3x3Kernel &&)=default
 Default Move Constructor.
 
NEDepthwiseConvolutionLayer3x3Kernel & operator= (NEDepthwiseConvolutionLayer3x3Kernel &&)=default
 Default move assignment operator.
 
void configure (const ITensor *input, const ITensor *weights, ITensor *output, const PadStrideInfo &conv_info, unsigned int depth_multiplier=1, const Size2D &dilation=Size2D(1U, 1U))
 Initialize the function's source, destination, conv and border_size.
 
void run (const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window.
 
BorderSize border_size () const override
 The size of the border for that kernel.
 
- Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor.
 
- Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.
 
virtual ~IKernel ()=default
 Destructor.
 
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.
 
const Window & window () const
 The maximum window the kernel can be executed on.
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *output, const PadStrideInfo &conv_info, unsigned int depth_multiplier=1, const Size2D &dilation=Size2D(1U, 1U))
 Static function to check if given info will lead to a valid configuration of NEDepthwiseConvolutionLayer3x3Kernel.
 

Detailed Description

Interface for the kernel to run a 3x3 depthwise convolution on a tensor.

Definition at line 35 of file NEDepthwiseConvolutionLayer3x3Kernel.h.

Constructor & Destructor Documentation

◆ NEDepthwiseConvolutionLayer3x3Kernel() [1/3]

Default constructor.

Definition at line 242 of file NEDepthwiseConvolutionLayer3x3Kernel.cpp.

NEDepthwiseConvolutionLayer3x3Kernel::NEDepthwiseConvolutionLayer3x3Kernel()
    : _border_size(0), _input(), _output(), _weights(), _conv_info(), _num_elems_written_per_iteration(0), _depth_multiplier(1), _dilation()
{
}

◆ NEDepthwiseConvolutionLayer3x3Kernel() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ NEDepthwiseConvolutionLayer3x3Kernel() [3/3]

Default Move Constructor.

Member Function Documentation

◆ border_size()

BorderSize border_size ( ) const
override virtual

The size of the border for that kernel.

Returns
The width in number of elements of the border.

Reimplemented from IKernel.

Definition at line 247 of file NEDepthwiseConvolutionLayer3x3Kernel.cpp.

BorderSize NEDepthwiseConvolutionLayer3x3Kernel::border_size() const
{
    return _border_size;
}

◆ configure()

void configure ( const ITensor * input,
const ITensor * weights,
ITensor * output,
const PadStrideInfo & conv_info,
unsigned int depth_multiplier = 1,
const Size2D & dilation = Size2D(1U, 1U)
)

Initialize the function's source, destination, conv and border_size.

Note
Supported data layouts: NCHW and NHWC
Parameters
[in]  input             Source tensor. Data types supported: QASYMM8/F16/F32.
[in]  weights           Weights tensor. This is a 3D tensor with dimensions [3, 3, IFM] for the NCHW data layout or [IFM, 3, 3] for NHWC. Data type supported: Same as input.
[out] output            Destination tensor. Data type supported: Same as input.
[in]  conv_info         Padding and stride information to use for the convolution.
[in]  depth_multiplier  (Optional) Multiplier to apply to the input's depth in order to retrieve the output's depth. Defaults to 1.
[in]  dilation          (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
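
To make the roles of conv_info, depth_multiplier and dilation concrete, the following is a minimal scalar sketch of a 3x3 depthwise convolution over single-batch NCHW float data. It is not the library's NEON implementation. ConvParams, output_extent and depthwise_conv3x3_reference are hypothetical names, and the sketch assumes zero padding, floor rounding of the output size, and that output channel oc reads input channel oc / depth_multiplier.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical parameter bundle for this sketch; not a Compute Library type.
struct ConvParams
{
    int stride_x = 1, stride_y = 1;
    int pad_left = 0, pad_top = 0, pad_right = 0, pad_bottom = 0;
    int dilation_x = 1, dilation_y = 1;
    int depth_multiplier = 1;
};

// Output extent along one axis for a 3-tap kernel, assuming floor rounding.
inline int output_extent(int in, int pad_before, int pad_after, int stride, int dilation)
{
    const int dilated_kernel = dilation * (3 - 1) + 1; // span covered by 3 dilated taps
    return (in + pad_before + pad_after - dilated_kernel) / stride + 1;
}

// Scalar reference: input is [channels][height][width],
// weights are [channels * depth_multiplier][3][3] (assumed ordering).
std::vector<float> depthwise_conv3x3_reference(const std::vector<float> &input, int channels, int height, int width,
                                               const std::vector<float> &weights, const ConvParams &p,
                                               int &out_height, int &out_width)
{
    out_width  = output_extent(width, p.pad_left, p.pad_right, p.stride_x, p.dilation_x);
    out_height = output_extent(height, p.pad_top, p.pad_bottom, p.stride_y, p.dilation_y);
    const int out_channels = channels * p.depth_multiplier;
    assert(input.size() == static_cast<std::size_t>(channels) * height * width);
    assert(weights.size() == static_cast<std::size_t>(out_channels) * 9);

    std::vector<float> output(static_cast<std::size_t>(out_channels) * out_height * out_width, 0.0f);
    for(int oc = 0; oc < out_channels; ++oc)
    {
        const int ic = oc / p.depth_multiplier; // assumed channel mapping
        for(int oy = 0; oy < out_height; ++oy)
        {
            for(int ox = 0; ox < out_width; ++ox)
            {
                float acc = 0.0f;
                for(int ky = 0; ky < 3; ++ky)
                {
                    for(int kx = 0; kx < 3; ++kx)
                    {
                        const int iy = oy * p.stride_y - p.pad_top + ky * p.dilation_y;
                        const int ix = ox * p.stride_x - p.pad_left + kx * p.dilation_x;
                        if(iy < 0 || iy >= height || ix < 0 || ix >= width)
                        {
                            continue; // taps that fall in the zero padding add nothing
                        }
                        acc += input[(static_cast<std::size_t>(ic) * height + iy) * width + ix]
                               * weights[(static_cast<std::size_t>(oc) * 3 + ky) * 3 + kx];
                    }
                }
                output[(static_cast<std::size_t>(oc) * out_height + oy) * out_width + ox] = acc;
            }
        }
    }
    return output;
}
```

With a 3x3 input of ones, a 3x3 filter of ones, stride 1 and a padding of 1 on every side, the output keeps the 3x3 size; the centre element sums all nine taps and each corner sums only the four taps that overlap the input.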

Definition at line 252 of file NEDepthwiseConvolutionLayer3x3Kernel.cpp.

void NEDepthwiseConvolutionLayer3x3Kernel::configure(const ITensor *input, const ITensor *weights, ITensor *output, const PadStrideInfo &conv_info, unsigned int depth_multiplier, const Size2D &dilation)
{
    ARM_COMPUTE_ERROR_ON_NULLPTR(input, weights, output);
    ARM_COMPUTE_ERROR_THROW_ON(validate_arguments(input->info(), weights->info(), output->info(), conv_info, depth_multiplier, dilation));

    _input            = input;
    _output           = output;
    _weights          = weights;
    _conv_info        = conv_info;
    _depth_multiplier = depth_multiplier;
    switch(input->info()->data_type())
    {
        case DataType::QASYMM8:
        case DataType::F32:
            _num_elems_written_per_iteration = 16 >> _conv_info.stride().first;
            break;
        case DataType::F16:
            _num_elems_written_per_iteration = 32 >> _conv_info.stride().first;
            break;
        default:
            ARM_COMPUTE_ERROR("Data type not supported.");
    }
    _border_size    = BorderSize(_conv_info.pad_top(), _conv_info.pad_right(), _conv_info.pad_bottom(), _conv_info.pad_left());
    _dilation       = dilation;
    auto win_config = validate_and_configure_window(_input->info(), _weights->info(), _output->info(), _conv_info, _depth_multiplier, dilation);
    ARM_COMPUTE_ERROR_THROW_ON(win_config.first);
    INEKernel::configure(win_config.second);
}

References ARM_COMPUTE_ERROR, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, arm_compute::test::validation::conv_info, ITensorInfo::data_type(), arm_compute::test::validation::dilation, arm_compute::F16, arm_compute::F32, ITensor::info(), CLTensor::info(), PadStrideInfo::pad_bottom(), PadStrideInfo::pad_left(), PadStrideInfo::pad_right(), PadStrideInfo::pad_top(), arm_compute::QASYMM8, PadStrideInfo::stride(), arm_compute::validate_and_configure_window(), and arm_compute::test::validation::weights.
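
The switch in configure() sets the number of elements written per iteration by right-shifting a base count by the x-stride, so each unit of stride halves it. The sketch below restates that arithmetic in isolation. ElemType and elems_written_per_iteration are hypothetical names introduced for illustration; only the base values 16 and 32 and the shift come from the listing above.

```cpp
#include <cassert>

// Hypothetical stand-in for arm_compute::DataType, limited to the types listed for this kernel.
enum class ElemType
{
    QASYMM8,
    F16,
    F32
};

// Same arithmetic as the switch in configure(): base >> stride_x.
inline unsigned int elems_written_per_iteration(ElemType type, unsigned int stride_x)
{
    assert(stride_x < 32); // shifting by the full bit width or more would be undefined
    const unsigned int base = (type == ElemType::F16) ? 32u : 16u;
    return base >> stride_x;
}
```

For F32 this gives 8 elements at stride 1 and 4 at stride 2; F16 starts from 32, so it gives 16 at stride 1.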

◆ name()

const char* name ( ) const
inline override virtual

Name of the kernel.

Returns
Kernel name

Implements ICPPKernel.

Definition at line 38 of file NEDepthwiseConvolutionLayer3x3Kernel.h.

const char *name() const override
{
    return "NEDepthwiseConvolutionLayer3x3Kernel";
}

◆ operator=() [1/2]

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

Default move assignment operator.
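
The copy and move rules documented for this class follow a common pattern for objects that hold non-owning pointers: copying is deleted and moving is defaulted. A minimal sketch of that pattern, using a hypothetical PointerHoldingKernel class that is not part of the library:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical class for illustration; it only mimics the special member functions listed above.
class PointerHoldingKernel
{
public:
    PointerHoldingKernel() = default;
    PointerHoldingKernel(const PointerHoldingKernel &) = delete;            // no copy construction
    PointerHoldingKernel &operator=(const PointerHoldingKernel &) = delete; // no copy assignment
    PointerHoldingKernel(PointerHoldingKernel &&) = default;                // move construction allowed
    PointerHoldingKernel &operator=(PointerHoldingKernel &&) = default;     // move assignment allowed

    void set_input(const float *input)
    {
        _input = input;
    }
    const float *input() const
    {
        return _input;
    }

private:
    const float *_input{ nullptr }; // non-owning pointer
};

// Compile-time checks that the class behaves as declared.
static_assert(!std::is_copy_constructible<PointerHoldingKernel>::value, "copy construction is disabled");
static_assert(!std::is_copy_assignable<PointerHoldingKernel>::value, "copy assignment is disabled");
static_assert(std::is_move_constructible<PointerHoldingKernel>::value, "move construction is allowed");
static_assert(std::is_move_assignable<PointerHoldingKernel>::value, "move assignment is allowed");
```

A defaulted move of a raw pointer member simply copies the pointer value, so the moved-to object refers to the same tensor as the source did.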

◆ run()

void run ( const Window & window,
const ThreadInfo & info
)
override virtual

Execute the kernel on the passed window.

Warning
If is_parallelisable() returns false, the passed window must be equal to window().
Note
The window has to be a region within the window returned by the window() method.
The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
[in]  window  Region on which to execute the kernel. (Must be a region of the window returned by window())
[in]  info    Info about executing thread and CPU.
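
The two constraints in the note (a sub-window lies inside window(), and its width is a multiple of the per-iteration element count) can be illustrated with a one-dimensional split. The sketch below is not the library's Window API; Range and split_range are hypothetical names, and it assumes the full range already spans a whole number of steps.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical 1-D range standing in for one dimension of a window.
struct Range
{
    int start;
    int end;  // exclusive
    int step; // elements processed per iteration
};

// Split `full` into at most `parts` sub-ranges whose boundaries fall on multiples of step.
std::vector<Range> split_range(const Range &full, int parts)
{
    assert(parts > 0 && full.step > 0 && full.end >= full.start);
    const int iterations = (full.end - full.start + full.step - 1) / full.step; // total steps, rounded up
    const int per_part   = (iterations + parts - 1) / parts;                    // steps per sub-range, rounded up
    std::vector<Range> out;
    for(int it = 0; it < iterations; it += per_part)
    {
        const int s = full.start + it * full.step;
        const int e = std::min(full.end, s + per_part * full.step);
        out.push_back(Range{ s, e, full.step });
    }
    return out;
}
```

Splitting the range [0, 64) with step 8 across three threads yields [0, 24), [24, 48) and [48, 64); every piece stays inside the original range and has a width divisible by 8.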

Implements ICPPKernel.

Definition at line 290 of file NEDepthwiseConvolutionLayer3x3Kernel.cpp.

void NEDepthwiseConvolutionLayer3x3Kernel::run(const Window &window, const ThreadInfo &info)
{

    switch(_input->info()->data_type())
    {
#ifdef __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
        case DataType::F16:
            convolve_3x3<float16_t, float16_t>(window, _num_elems_written_per_iteration, _input, _weights, _output, _conv_info, _depth_multiplier, _dilation);
            break;
#endif // __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
        case DataType::F32:
            convolve_3x3<float, float>(window, _num_elems_written_per_iteration, _input, _weights, _output, _conv_info, _depth_multiplier, _dilation);
            break;
        case DataType::QASYMM8:
            convolve_3x3<uint8_t, int32_t>(window, _num_elems_written_per_iteration, _input, _weights, _output, _conv_info, _depth_multiplier, _dilation);
            break;
        default:
            ARM_COMPUTE_ERROR("Not implemented");
    }
}

References ARM_COMPUTE_ERROR, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, ITensorInfo::data_type(), arm_compute::F16, arm_compute::F32, ITensor::info(), arm_compute::test::validation::info, arm_compute::QASYMM8, and IKernel::window().
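
run() maps a runtime data type onto a template instantiation of convolve_3x3. The sketch below shows that dispatch pattern on its own. SketchDataType, convolve_3x3_stub and dispatch are hypothetical names; the stub only reports the size of its second template argument, whose exact role is not stated on this page. F16 is left out of the switch, which mirrors a build where __ARM_FEATURE_FP16_VECTOR_ARITHMETIC is not defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for arm_compute::DataType.
enum class SketchDataType
{
    QASYMM8,
    F16,
    F32
};

// Placeholder for the templated worker. It returns sizeof(T2) so the chosen
// instantiation can be observed from outside.
template <typename T1, typename T2>
std::size_t convolve_3x3_stub()
{
    return sizeof(T2);
}

// Runtime-to-compile-time dispatch in the style of run().
std::size_t dispatch(SketchDataType type)
{
    switch(type)
    {
        case SketchDataType::F32:
            return convolve_3x3_stub<float, float>();
        case SketchDataType::QASYMM8:
            return convolve_3x3_stub<std::uint8_t, std::int32_t>();
        default:
            // F16 lands here in this sketch, as it would without FP16 vector support.
            throw std::runtime_error("Not implemented");
    }
}
```

Keeping the switch in one place means the per-type inner loops are compiled separately, while callers only ever see the untemplated entry point.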

◆ validate()

Status validate ( const ITensorInfo * input,
const ITensorInfo * weights,
const ITensorInfo * output,
const PadStrideInfo & conv_info,
unsigned int depth_multiplier = 1,
const Size2D & dilation = Size2D(1U, 1U)
)
static

Static function to check if given info will lead to a valid configuration of NEDepthwiseConvolutionLayer3x3Kernel.

Note
Supported data layouts: NCHW and NHWC
Parameters
[in]  input             Source tensor info. Data types supported: QASYMM8/F16/F32.
[in]  weights           Weights tensor info. This is a 3D tensor with dimensions [3, 3, IFM] for the NCHW data layout or [IFM, 3, 3] for NHWC. Data type supported: Same as input.
[in]  output            Destination tensor info. Data type supported: Same as input.
[in]  conv_info         Padding and stride information to use for the convolution.
[in]  depth_multiplier  (Optional) Multiplier to apply to the input's depth in order to retrieve the output's depth. Defaults to 1.
[in]  dilation          (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
Returns
A Status: an error status if the argument or window checks fail, otherwise an empty Status.
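
The two weight shapes named in the parameter list differ only in which dimension varies fastest. As a rough illustration, the sketch below computes the linear element offset of a filter tap under each layout. SketchLayout and weight_offset are hypothetical names, and the sketch assumes dense packing with the first listed dimension varying fastest and no padding between rows.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layout tag; not arm_compute::DataLayout.
enum class SketchLayout
{
    NCHW,
    NHWC
};

// Linear element offset of filter tap (kx, ky) for channel c, with ifm channels in total.
inline std::size_t weight_offset(SketchLayout layout, std::size_t kx, std::size_t ky, std::size_t c, std::size_t ifm)
{
    assert(kx < 3 && ky < 3 && c < ifm);
    if(layout == SketchLayout::NCHW)
    {
        return kx + 3 * (ky + 3 * c); // shape [3, 3, IFM]: x fastest, then y, then channel
    }
    return c + ifm * (kx + 3 * ky); // shape [IFM, 3, 3]: channel fastest, then x, then y
}
```

With 8 channels, tap (kx = 1, ky = 2) of channel 4 sits at offset 43 in the NCHW shape and at offset 60 in the NHWC shape.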

Definition at line 282 of file NEDepthwiseConvolutionLayer3x3Kernel.cpp.

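Status NEDepthwiseConvolutionLayer3x3Kernel::validate(const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *output, const PadStrideInfo &conv_info, unsigned int depth_multiplier, const Size2D &dilation)
{
    ARM_COMPUTE_RETURN_ON_ERROR(validate_arguments(input, weights, output, conv_info, depth_multiplier, dilation));
    ARM_COMPUTE_RETURN_ON_ERROR(validate_and_configure_window(input->clone().get(), weights->clone().get(), output->clone().get(), conv_info, depth_multiplier, dilation).first);
    return Status{};
}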

References ARM_COMPUTE_RETURN_ON_ERROR, ICloneable< T >::clone(), arm_compute::test::validation::conv_info, arm_compute::test::validation::dilation, arm_compute::validate_and_configure_window(), and arm_compute::test::validation::weights.

Referenced by NEDepthwiseConvolutionLayer3x3::validate(), and NEDepthwiseConvolutionLayerOptimized::validate().


The documentation for this class was generated from the following files: