Compute Library
 21.02
NEDepthwiseConvolutionLayerNativeKernel Class Reference

Interface for the kernel to run a depthwise convolution native on a tensor. More...

#include <NEDepthwiseConvolutionLayerNativeKernel.h>

Collaboration diagram for NEDepthwiseConvolutionLayerNativeKernel (diagram and legend not shown).

Public Member Functions

const char * name () const override
 Name of the kernel. More...
 
 NEDepthwiseConvolutionLayerNativeKernel ()
 Default constructor. More...
 
 NEDepthwiseConvolutionLayerNativeKernel (const NEDepthwiseConvolutionLayerNativeKernel &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
NEDepthwiseConvolutionLayerNativeKernel & operator= (const NEDepthwiseConvolutionLayerNativeKernel &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 NEDepthwiseConvolutionLayerNativeKernel (NEDepthwiseConvolutionLayerNativeKernel &&)=default
 Default Move Constructor. More...
 
NEDepthwiseConvolutionLayerNativeKernel & operator= (NEDepthwiseConvolutionLayerNativeKernel &&)=default
 Default move assignment operator. More...
 
 ~NEDepthwiseConvolutionLayerNativeKernel ()=default
 Default destructor. More...
 
void configure (const ITensor *input, const ITensor *weights, const ITensor *biases, ITensor *output, const PadStrideInfo &conv_info, unsigned int depth_multiplier=1, const Size2D &dilation=Size2D(1U, 1U))
 Initialize the function's source, destination and parameters. More...
 
void run (const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window. More...
 
- Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor. More...
 
virtual void run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator)
 Legacy compatibility layer for implementations which do not support thread_locator. In these cases we simply narrow the interface down to the legacy version. More...
 
virtual void run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info)
 Execute the kernel on the passed window. More...
 
- Public Member Functions inherited from IKernel
 IKernel ()
 Constructor. More...
 
virtual ~IKernel ()=default
 Destructor. More...
 
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable. More...
 
virtual BorderSize border_size () const
 The size of the border for that kernel. More...
 
const Window & window () const
 The maximum window the kernel can be executed on. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, const PadStrideInfo &conv_info, unsigned int depth_multiplier=1, const Size2D &dilation=Size2D(1U, 1U))
 Static function to check if given info will lead to a valid configuration of NEDepthwiseConvolutionLayerNativeKernel. More...
 

Detailed Description

Interface for the kernel to run a depthwise convolution native on a tensor.

Definition at line 41 of file NEDepthwiseConvolutionLayerNativeKernel.h.

Constructor & Destructor Documentation

◆ NEDepthwiseConvolutionLayerNativeKernel() [1/3]

Default constructor.

Definition at line 770 of file NEDepthwiseConvolutionLayerNativeKernel.cpp.

Referenced by NEDepthwiseConvolutionLayerNativeKernel::name().

770 NEDepthwiseConvolutionLayerNativeKernel::NEDepthwiseConvolutionLayerNativeKernel()
771  : _func(), _input(), _weights(), _biases(), _output(), _conv_info(), _depth_multiplier(1), _dilation(), _output_multiplier(), _output_shift(), _has_biases()
772 {
773 }

◆ NEDepthwiseConvolutionLayerNativeKernel() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ NEDepthwiseConvolutionLayerNativeKernel() [3/3]

Default Move Constructor.

◆ ~NEDepthwiseConvolutionLayerNativeKernel()

Default destructor.

Member Function Documentation

◆ configure()

void configure ( const ITensor *       input,
                 const ITensor *       weights,
                 const ITensor *       biases,
                 ITensor *             output,
                 const PadStrideInfo & conv_info,
                 unsigned int          depth_multiplier = 1,
                 const Size2D &        dilation = Size2D(1U, 1U)
               )

Initialize the function's source, destination and parameters.

Note
Supported data layouts: NHWC
Parameters
[in]  input             Source tensor. Data type supported: QASYMM8/QASYMM8_SIGNED/F16/F32.
[in]  weights           Weights tensor. This is a 3D tensor with dimensions [IFM, W, H]. Data type supported: same as input, or QASYMM8/QASYMM8_SIGNED/QSYMM8_PER_CHANNEL when input is QASYMM8/QASYMM8_SIGNED.
[in]  biases            Biases tensor. A 1D tensor with dimensions [IFM]. Must be nullptr if not needed. Data type supported: same as input, or S32 when input is QASYMM8/QASYMM8_SIGNED.
[out] output            Destination tensor. Data type supported: same as input.
[in]  conv_info         Padding and stride information to use for the convolution.
[in]  depth_multiplier  (Optional) Multiplier to apply to the input's depth in order to retrieve the output's depth. Defaults to 1.
[in]  dilation          (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).

Definition at line 775 of file NEDepthwiseConvolutionLayerNativeKernel.cpp.

References ARM_COMPUTE_ERROR, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, arm_compute::auto_init_if_empty(), arm_compute::calculate_max_window(), arm_compute::quantization::calculate_quantized_multiplier(), ICloneable< T >::clone(), arm_compute::misc::shape_calculator::compute_depthwise_convolution_shape(), arm_compute::test::validation::conv_info, ITensorInfo::data_type(), ITensorInfo::dimension(), arm_compute::F16, arm_compute::F32, ITensor::info(), arm_compute::test::validation::input, arm_compute::is_data_type_quantized(), arm_compute::is_data_type_quantized_per_channel(), ITensorInfo::num_dimensions(), arm_compute::test::validation::output_shape, arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, arm_compute::QSYMM8_PER_CHANNEL, ITensorInfo::quantization_info(), UniformQuantizationInfo::scale, QuantizationInfo::scale(), Dimensions< T >::set_num_dimensions(), ITensorInfo::set_valid_region(), ITensorInfo::tensor_shape(), QuantizationInfo::uniform(), and arm_compute::validate_arguments().

Referenced by NEDepthwiseConvolutionLayerNativeKernel::name(), and arm_compute::test::validation::TEST_CASE().

777 {
778  ARM_COMPUTE_ERROR_ON_NULLPTR(input, weights, output);
779  ARM_COMPUTE_ERROR_THROW_ON(validate_arguments(input->info(), weights->info(), (biases != nullptr) ? biases->info() : nullptr, output->info(), conv_info, depth_multiplier, dilation));
780 
781  _input = input;
782  _weights = weights;
783  _biases = biases;
784  _output = output;
785  _conv_info = conv_info;
786  _depth_multiplier = depth_multiplier;
787  _dilation = dilation;
788  _has_biases = (biases != nullptr);
789 
790  if(is_data_type_quantized(_input->info()->data_type()))
791  {
792  const auto input_scale = input->info()->quantization_info().uniform().scale;
793  const auto output_scale = output->info()->quantization_info().uniform().scale;
794 
795  auto weights_scale = weights->info()->quantization_info().scale();
796  if(!is_data_type_quantized_per_channel(_weights->info()->data_type()))
797  {
798  for(size_t i = 1; i < _weights->info()->dimension(channel_idx); ++i)
799  {
800  weights_scale.push_back(weights_scale.front());
801  }
802  }
803 
804  for(const auto &s : weights_scale)
805  {
806  int32_t out_mult = 0;
807  int32_t out_shift = 0;
808  const float multiplier = input_scale * s / output_scale;
809  arm_compute::quantization::calculate_quantized_multiplier(multiplier, &out_mult, &out_shift);
810 
811  _output_multiplier.push_back(out_mult);
812  _output_shift.push_back(out_shift);
813  }
814  }
815 
816  switch(_weights->info()->data_type())
817  {
818  case DataType::QASYMM8:
819  _func = &NEDepthwiseConvolutionLayerNativeKernel::run_depthwise<uint8_t, uint8_t>;
820  break;
821  case DataType::QASYMM8_SIGNED:
822  _func = &NEDepthwiseConvolutionLayerNativeKernel::run_depthwise<int8_t, int8_t>;
823  break;
824  case DataType::QSYMM8_PER_CHANNEL:
825  if(_input->info()->data_type() == DataType::QASYMM8)
826  {
827  _func = &NEDepthwiseConvolutionLayerNativeKernel::run_depthwise<uint8_t, int8_t>;
828  }
829  else
830  {
831  _func = &NEDepthwiseConvolutionLayerNativeKernel::run_depthwise<int8_t, int8_t>;
832  }
833  break;
834 #ifdef __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
835  case DataType::F16:
836  _func = &NEDepthwiseConvolutionLayerNativeKernel::run_depthwise<float16_t, float16_t>;
837  break;
838 #endif // __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
839  case DataType::F32:
840  _func = &NEDepthwiseConvolutionLayerNativeKernel::run_depthwise<float, float>;
841  break;
842  default:
843  ARM_COMPUTE_ERROR("Data type not supported");
844  break;
845  }
846 
847  const TensorShape output_shape = misc::shape_calculator::compute_depthwise_convolution_shape(*input->info(), *weights->info(), conv_info, depth_multiplier, dilation);
848  auto_init_if_empty(*output->info(), input->info()->clone()->set_is_resizable(true).reset_padding().set_tensor_shape(output_shape).set_quantization_info(output->info()->quantization_info()));
849 
850  Window win = calculate_max_window(*output->info(), Steps());
851  Coordinates coord;
852  coord.set_num_dimensions(output->info()->num_dimensions());
853  output->info()->set_valid_region(ValidRegion(coord, output->info()->tensor_shape()));
854  INEKernel::configure(win);
855 }

◆ name()

Name of the kernel.

◆ operator=() [1/2]

Prevent instances of this class from being copied (As this class contains pointers)

Referenced by NEDepthwiseConvolutionLayerNativeKernel::name().

◆ operator=() [2/2]

Default move assignment operator.

◆ run()

void run ( const Window &     window,
           const ThreadInfo & info
         )
override virtual

Execute the kernel on the passed window.

Warning
If is_parallelisable() returns false then the passed window must be equal to window()
Note
The window has to be a region within the window returned by the window() method
The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
[in]windowRegion on which to execute the kernel. (Must be a region of the window returned by window())
[in]infoInfo about executing thread and CPU.

Reimplemented from ICPPKernel.

Definition at line 865 of file NEDepthwiseConvolutionLayerNativeKernel.cpp.

References ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, ITensorInfo::data_type(), ITensor::info(), arm_compute::is_data_type_quantized_per_channel(), and IKernel::window().

Referenced by NEDepthwiseConvolutionLayerNativeKernel::name().

866 {
867  ARM_COMPUTE_UNUSED(info);
868  ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);
869  ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW(INEKernel::window(), window);
870 
871  (this->*_func)(window, _has_biases);
872 }

◆ validate()

static Status validate ( const ITensorInfo *   input,
                         const ITensorInfo *   weights,
                         const ITensorInfo *   biases,
                         const ITensorInfo *   output,
                         const PadStrideInfo & conv_info,
                         unsigned int          depth_multiplier = 1,
                         const Size2D &        dilation = Size2D(1U, 1U)
                       )

Static function to check if given info will lead to a valid configuration of NEDepthwiseConvolutionLayerNativeKernel.

Note
Supported data layouts: NHWC
Parameters
[in]  input             Source tensor info. Data type supported: QASYMM8/QASYMM8_SIGNED/F16/F32.
[in]  weights           Weights tensor info. This is a 3D tensor with dimensions [IFM, W, H]. Data type supported: same as input, or QASYMM8/QASYMM8_SIGNED/QSYMM8_PER_CHANNEL when input is QASYMM8/QASYMM8_SIGNED.
[in]  biases            Biases tensor info. A 1D tensor with dimensions [IFM]. Must be nullptr if not needed. Data type supported: same as input, or S32 when input is QASYMM8/QASYMM8_SIGNED.
[in]  output            Destination tensor info. Data type supported: same as input.
[in]  conv_info         Padding and stride information to use for the convolution.
[in]  depth_multiplier  (Optional) Multiplier to apply to the input's depth in order to retrieve the output's depth. Defaults to 1.
[in]  dilation          (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
Returns
a status

Definition at line 857 of file NEDepthwiseConvolutionLayerNativeKernel.cpp.

References ARM_COMPUTE_RETURN_ON_ERROR, and arm_compute::validate_arguments().

Referenced by NEDepthwiseConvolutionLayerNativeKernel::name().

857 Status NEDepthwiseConvolutionLayerNativeKernel::validate(const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output,
858                                                          const PadStrideInfo &conv_info,
859                                                          unsigned int depth_multiplier, const Size2D &dilation)
860 {
861  ARM_COMPUTE_RETURN_ON_ERROR(validate_arguments(input, weights, biases, output, conv_info, depth_multiplier, dilation));
862  return Status{};
863 }

The documentation for this class was generated from the following files:

NEDepthwiseConvolutionLayerNativeKernel.h
NEDepthwiseConvolutionLayerNativeKernel.cpp