Interface for the kernel that runs a native depthwise convolution on a tensor.

#include <NEDepthwiseConvolutionLayerNativeKernel.h>

Public Member Functions
const char *	name () const override
	Name of the kernel.

	NEDepthwiseConvolutionLayerNativeKernel ()
	Default constructor.

	NEDepthwiseConvolutionLayerNativeKernel (const NEDepthwiseConvolutionLayerNativeKernel &)=delete
	Prevent instances of this class from being copied (as this class contains pointers).

NEDepthwiseConvolutionLayerNativeKernel &	operator= (const NEDepthwiseConvolutionLayerNativeKernel &)=delete
	Prevent instances of this class from being copied (as this class contains pointers).

	NEDepthwiseConvolutionLayerNativeKernel (NEDepthwiseConvolutionLayerNativeKernel &&)=default
	Default move constructor.

NEDepthwiseConvolutionLayerNativeKernel &	operator= (NEDepthwiseConvolutionLayerNativeKernel &&)=default
	Default move assignment operator.

	~NEDepthwiseConvolutionLayerNativeKernel ()=default
	Default destructor.

void	configure (const ITensor *input, const ITensor *weights, const ITensor *biases, ITensor *output, const PadStrideInfo &conv_info, unsigned int depth_multiplier=1, const Size2D &dilation=Size2D(1U, 1U))
	Initialize the function's source, destination and parameters.

void	run (const Window &window, const ThreadInfo &info) override
	Execute the kernel on the passed window.

Public Member Functions inherited from ICPPKernel
virtual	~ICPPKernel ()=default
	Default destructor.

virtual void	run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator)
	Legacy compatibility layer for implementations which do not support thread_locator. In these cases we simply narrow the interface down to the legacy version.

virtual void	run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info)
	Execute the kernel on the passed window.

Public Member Functions inherited from IKernel
	IKernel ()
	Constructor.

virtual	~IKernel ()=default
	Destructor.

virtual bool	is_parallelisable () const
	Indicates whether or not the kernel is parallelisable.

virtual BorderSize	border_size () const
	The size of the border for that kernel.

const Window &	window () const
	The maximum window the kernel can be executed on.

Static Public Member Functions
static Status	validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, const PadStrideInfo &conv_info, unsigned int depth_multiplier=1, const Size2D &dilation=Size2D(1U, 1U))
	Static function to check if given info will lead to a valid configuration of NEDepthwiseConvolutionLayerNativeKernel.

Detailed Description

Interface for the kernel that runs a native depthwise convolution on a tensor.

Definition at line 41 of file NEDepthwiseConvolutionLayerNativeKernel.h.

Constructor & Destructor Documentation

◆ NEDepthwiseConvolutionLayerNativeKernel() [1/3]

NEDepthwiseConvolutionLayerNativeKernel ( )

Default constructor.

Definition at line 770 of file NEDepthwiseConvolutionLayerNativeKernel.cpp.

Referenced by NEDepthwiseConvolutionLayerNativeKernel::name().

     : _func(), _input(), _weights(), _biases(), _output(), _conv_info(), _depth_multiplier(1), _dilation(), _output_multiplier(), _output_shift(), _has_biases()
 {
 }

◆ NEDepthwiseConvolutionLayerNativeKernel() [2/3]

NEDepthwiseConvolutionLayerNativeKernel ( const NEDepthwiseConvolutionLayerNativeKernel & )

delete

Prevent instances of this class from being copied (as this class contains pointers).

◆ NEDepthwiseConvolutionLayerNativeKernel() [3/3]

NEDepthwiseConvolutionLayerNativeKernel ( NEDepthwiseConvolutionLayerNativeKernel && )

default

Default Move Constructor.

◆ ~NEDepthwiseConvolutionLayerNativeKernel()

~NEDepthwiseConvolutionLayerNativeKernel ( )

default

Default destructor.

Referenced by NEDepthwiseConvolutionLayerNativeKernel::name().

Member Function Documentation

◆ configure()

void configure	(	const ITensor *	input,
		const ITensor *	weights,
		const ITensor *	biases,
		ITensor *	output,
		const PadStrideInfo &	conv_info,
		unsigned int	depth_multiplier = `1`,
		const Size2D &	dilation = `Size2D(1U, 1U)`
	)

Initialize the function's source, destination and parameters.

Note: Supported data layouts: NHWC

Parameters

[in]	input	Source tensor. DataType supported: QASYMM8/QASYMM8_SIGNED/F16/F32.
[in]	weights	Weights tensor. This is a 3D tensor with dimensions [IFM, W, H]. Data type supported: Same as `input` or QASYMM8/QASYMM8_SIGNED/QSYMM8_PER_CHANNEL when `input` is QASYMM8/QASYMM8_SIGNED.
[in]	biases	Biases tensor. A 1D tensor with dimensions [IFM]. Must be nullptr if not needed. Data type supported: Same as `input`, S32 when input is QASYMM8/QASYMM8_SIGNED.
[out]	output	Destination tensor. Data type supported: Same as `input`.
[in]	conv_info	Padding and stride information to use for the convolution.
[in]	depth_multiplier	(Optional) Multiplier to apply to the input's depth in order to retrieve the output's depth. Defaults to 1.
[in]	dilation	(Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
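
The relationship between these parameters and the output tensor shape can be sketched as follows. This is a minimal, self-contained illustration of the usual convolution shape arithmetic, not the library's compute_depthwise_convolution_shape(). The helper names conv_output_extent and depthwise_output_channels are hypothetical, and floor rounding is an assumption (PadStrideInfo may select a different rounding mode).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the library): output extent along one
// spatial axis, assuming FLOOR rounding of the division by the stride.
std::size_t conv_output_extent(std::size_t in, std::size_t kernel, std::size_t stride,
                               std::size_t pad_before, std::size_t pad_after,
                               std::size_t dilation)
{
    assert(kernel > 0 && stride > 0 && dilation > 0);
    // A dilated kernel covers dilation * (kernel - 1) + 1 input elements.
    const std::size_t effective_kernel = dilation * (kernel - 1) + 1;
    const std::size_t padded           = in + pad_before + pad_after;
    assert(padded >= effective_kernel);
    return (padded - effective_kernel) / stride + 1;
}

// Hypothetical helper: depth_multiplier scales the input depth to give the
// output depth, as described for the depth_multiplier parameter.
std::size_t depthwise_output_channels(std::size_t input_channels, std::size_t depth_multiplier)
{
    return input_channels * depth_multiplier;
}
```

For example, a 32-element axis with a 3-wide kernel, stride 1 and one element of padding on each side keeps its extent of 32, and 16 input channels with depth_multiplier = 2 give 32 output channels.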

Definition at line 775 of file NEDepthwiseConvolutionLayerNativeKernel.cpp.

References ARM_COMPUTE_ERROR, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, arm_compute::auto_init_if_empty(), arm_compute::calculate_max_window(), arm_compute::quantization::calculate_quantized_multiplier(), ICloneable< T >::clone(), arm_compute::misc::shape_calculator::compute_depthwise_convolution_shape(), arm_compute::test::validation::conv_info, ITensorInfo::data_type(), ITensorInfo::dimension(), arm_compute::F16, arm_compute::F32, ITensor::info(), arm_compute::test::validation::input, arm_compute::is_data_type_quantized(), arm_compute::is_data_type_quantized_per_channel(), ITensorInfo::num_dimensions(), arm_compute::test::validation::output_shape, arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, arm_compute::QSYMM8_PER_CHANNEL, ITensorInfo::quantization_info(), UniformQuantizationInfo::scale, QuantizationInfo::scale(), Dimensions< T >::set_num_dimensions(), ITensorInfo::set_valid_region(), ITensorInfo::tensor_shape(), QuantizationInfo::uniform(), and arm_compute::validate_arguments().

Referenced by NEDepthwiseConvolutionLayerNativeKernel::name(), and arm_compute::test::validation::TEST_CASE().

 {
     ARM_COMPUTE_ERROR_ON_NULLPTR(input, weights, output);
     ARM_COMPUTE_ERROR_THROW_ON(validate_arguments(input->info(), weights->info(), (biases != nullptr) ? biases->info() : nullptr, output->info(), conv_info, depth_multiplier, dilation));
 
     _input            = input;
     _weights          = weights;
     _biases           = biases;
     _output           = output;
     _conv_info        = conv_info;
     _depth_multiplier = depth_multiplier;
     _dilation         = dilation;
     _has_biases       = (biases != nullptr);
 
     if(is_data_type_quantized(_input->info()->data_type()))
     {
         const auto input_scale  = input->info()->quantization_info().uniform().scale;
         const auto output_scale = output->info()->quantization_info().uniform().scale;
 
         auto weights_scale = weights->info()->quantization_info().scale();
         if(!is_data_type_quantized_per_channel(_weights->info()->data_type()))
         {
             for(size_t i = 1; i < _weights->info()->dimension(channel_idx); ++i)
             {
                 weights_scale.push_back(weights_scale.front());
             }
         }
 
         for(const auto &s : weights_scale)
         {
             int32_t     out_mult   = 0;
             int32_t     out_shift  = 0;
             const float multiplier = input_scale * s / output_scale;
             arm_compute::quantization::calculate_quantized_multiplier(multiplier, &out_mult, &out_shift);
 
             _output_multiplier.push_back(out_mult);
             _output_shift.push_back(out_shift);
         }
     }
 
     switch(_weights->info()->data_type())
     {
         case DataType::QASYMM8:
             _func = &NEDepthwiseConvolutionLayerNativeKernel::run_depthwise<uint8_t, uint8_t>;
             break;
         case DataType::QASYMM8_SIGNED:
             _func = &NEDepthwiseConvolutionLayerNativeKernel::run_depthwise<int8_t, int8_t>;
             break;
         case DataType::QSYMM8_PER_CHANNEL:
             if(_input->info()->data_type() == DataType::QASYMM8)
             {
                 _func = &NEDepthwiseConvolutionLayerNativeKernel::run_depthwise<uint8_t, int8_t>;
             }
             else
             {
                 _func = &NEDepthwiseConvolutionLayerNativeKernel::run_depthwise<int8_t, int8_t>;
             }
             break;
 #ifdef __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
         case DataType::F16:
             _func = &NEDepthwiseConvolutionLayerNativeKernel::run_depthwise<float16_t, float16_t>;
             break;
 #endif // __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
         case DataType::F32:
             _func = &NEDepthwiseConvolutionLayerNativeKernel::run_depthwise<float, float>;
             break;
         default:
             ARM_COMPUTE_ERROR("Data type not supported");
             break;
     }
 
     const TensorShape output_shape = misc::shape_calculator::compute_depthwise_convolution_shape(*input->info(), *weights->info(), conv_info, depth_multiplier, dilation);
     auto_init_if_empty(*output->info(), input->info()->clone()->set_is_resizable(true).reset_padding().set_tensor_shape(output_shape).set_quantization_info(output->info()->quantization_info()));
 
     Window      win = calculate_max_window(*output->info(), Steps());
     Coordinates coord;
     coord.set_num_dimensions(output->info()->num_dimensions());
     output->info()->set_valid_region(ValidRegion(coord, output->info()->tensor_shape()));
     INEKernel::configure(win);
 }
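
For quantized inputs, the listing above replicates a per-tensor weights scale once per channel and turns each real multiplier input_scale * s / output_scale into an integer multiplier and shift. The sketch below illustrates that general fixed-point technique with standard C++ only. It is not the implementation of calculate_quantized_multiplier(): the names FixedPointMultiplier, decompose_multiplier, reconstruct and per_channel_multipliers are hypothetical, and the exponent convention (value = mantissa / 2^31 * 2^exponent) is an assumption that may differ from the library's shift sign.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical representation: value ~= mantissa / 2^31 * 2^exponent.
struct FixedPointMultiplier
{
    int32_t mantissa; // Q0.31 fraction in [2^30, 2^31)
    int     exponent; // power-of-two scaling
};

FixedPointMultiplier decompose_multiplier(double multiplier)
{
    assert(multiplier > 0.0);
    int          exponent = 0;
    const double fraction = std::frexp(multiplier, &exponent); // fraction in [0.5, 1)
    int64_t      mantissa = std::llround(fraction * static_cast<double>(1LL << 31));
    if(mantissa == (1LL << 31)) // rounding reached 1.0: renormalise
    {
        mantissa /= 2;
        ++exponent;
    }
    return { static_cast<int32_t>(mantissa), exponent };
}

// Convert back to a real value, to check the approximation error.
double reconstruct(const FixedPointMultiplier &m)
{
    return std::ldexp(static_cast<double>(m.mantissa), m.exponent - 31);
}

// Replicate a single per-tensor weights scale across all channels, then
// compute one multiplier per channel.
std::vector<FixedPointMultiplier> per_channel_multipliers(float input_scale, std::vector<float> weights_scale,
                                                          float output_scale, std::size_t channels)
{
    assert(!weights_scale.empty());
    if(weights_scale.size() == 1)
    {
        const float per_tensor_scale = weights_scale.front();
        weights_scale.assign(channels, per_tensor_scale);
    }
    std::vector<FixedPointMultiplier> result;
    for(const float s : weights_scale)
    {
        result.push_back(decompose_multiplier(input_scale * s / output_scale));
    }
    return result;
}
```

Storing the multiplier as an integer pair lets the inner loop requantize accumulators with integer arithmetic only, which is why configure() precomputes one pair per channel rather than keeping the floating-point scales.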

◆ name()

const char* name ( ) const

inline override virtual

Name of the kernel.

Returns: Kernel name

Implements ICPPKernel.

Definition at line 44 of file NEDepthwiseConvolutionLayerNativeKernel.h.

References NEDepthwiseConvolutionLayerNativeKernel::configure(), arm_compute::test::validation::conv_info, arm_compute::test::validation::info, arm_compute::test::validation::input, NEDepthwiseConvolutionLayerNativeKernel::NEDepthwiseConvolutionLayerNativeKernel(), NEDepthwiseConvolutionLayerNativeKernel::operator=(), NEDepthwiseConvolutionLayerNativeKernel::run(), arm_compute::U, NEDepthwiseConvolutionLayerNativeKernel::validate(), IKernel::window(), and NEDepthwiseConvolutionLayerNativeKernel::~NEDepthwiseConvolutionLayerNativeKernel().

     {
         return "NEDepthwiseConvolutionLayerNativeKernel";
     }

◆ operator=() [1/2]

NEDepthwiseConvolutionLayerNativeKernel& operator= ( const NEDepthwiseConvolutionLayerNativeKernel & )

delete

Prevent instances of this class from being copied (as this class contains pointers).

Referenced by NEDepthwiseConvolutionLayerNativeKernel::name().

◆ operator=() [2/2]

NEDepthwiseConvolutionLayerNativeKernel& operator= ( NEDepthwiseConvolutionLayerNativeKernel && )

default

Default move assignment operator.

◆ run()

void run	(	const Window &	window,
		const ThreadInfo &	info
	)

override virtual

Execute the kernel on the passed window.

Warning: If is_parallelisable() returns false, the passed window must be equal to window().

Note: The window has to be a region within the window returned by the window() method. The width of the window has to be a multiple of num_elems_processed_per_iteration().

Parameters

[in]	window	Region on which to execute the kernel. (Must be a region of the window returned by window())
[in]	info	Info about executing thread and CPU.

Reimplemented from ICPPKernel.

Definition at line 865 of file NEDepthwiseConvolutionLayerNativeKernel.cpp.

References ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, ITensorInfo::data_type(), ITensor::info(), arm_compute::is_data_type_quantized_per_channel(), and IKernel::window().

Referenced by NEDepthwiseConvolutionLayerNativeKernel::name().

 {
     ARM_COMPUTE_UNUSED(info);
     ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);
     ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW(INEKernel::window(), window);
 
     (this->*_func)(window, _has_biases);
 }
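
The call (this->*_func)(window, _has_biases) dispatches through a pointer to a member function template that configure() selected once from the data type. The pattern can be sketched in isolation as below. MiniKernel, ElementType and run_typed are hypothetical names for illustration, and the "work" done per type is a placeholder (it only reports the number of bytes that would be touched).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class ElementType { U8, S8, F32 };

class MiniKernel
{
public:
    // Pick the typed implementation once, at configuration time.
    void configure(ElementType type)
    {
        switch(type)
        {
            case ElementType::U8:
                _func = &MiniKernel::run_typed<uint8_t>;
                break;
            case ElementType::S8:
                _func = &MiniKernel::run_typed<int8_t>;
                break;
            case ElementType::F32:
                _func = &MiniKernel::run_typed<float>;
                break;
        }
    }

    // The hot path has no switch: it only calls through the stored pointer.
    std::size_t run(std::size_t elements)
    {
        assert(_func != nullptr); // the kernel must be configured first
        return (this->*_func)(elements);
    }

private:
    // Placeholder for the typed inner loop: returns the bytes processed.
    template <typename T>
    std::size_t run_typed(std::size_t elements)
    {
        return elements * sizeof(T);
    }

    using KernelFn = std::size_t (MiniKernel::*)(std::size_t);
    KernelFn _func{ nullptr };
};
```

Resolving the type in configure() keeps run() free of per-call branching on the data type, at the cost of requiring configure() to have been called first.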

◆ validate()

Status validate	(	const ITensorInfo *	input,
		const ITensorInfo *	weights,
		const ITensorInfo *	biases,
		const ITensorInfo *	output,
		const PadStrideInfo &	conv_info,
		unsigned int	depth_multiplier = `1`,
		const Size2D &	dilation = `Size2D(1U, 1U)`
	)

static

Static function to check if given info will lead to a valid configuration of NEDepthwiseConvolutionLayerNativeKernel.

Note: Supported data layouts: NHWC

Parameters

[in]	input	Source tensor info. DataType supported: QASYMM8/QASYMM8_SIGNED/F16/F32.
[in]	weights	Weights tensor info. This is a 3D tensor with dimensions [IFM, W, H]. Data type supported: Same as `input` or QASYMM8/QASYMM8_SIGNED/QSYMM8_PER_CHANNEL when `input` is QASYMM8/QASYMM8_SIGNED.
[in]	biases	Biases tensor info. A 1D tensor with dimensions [IFM]. Must be nullptr if not needed. Data type supported: Same as `input`, S32 when input is QASYMM8/QASYMM8_SIGNED.
[in]	output	Destination tensor info. Data type supported: Same as `input`.
[in]	conv_info	Padding and stride information to use for the convolution.
[in]	depth_multiplier	(Optional) Multiplier to apply to the input's depth in order to retrieve the output's depth. Defaults to 1.
[in]	dilation	(Optional) Dilation, in elements, across x and y. Defaults to (1, 1).

Returns: a status

Definition at line 857 of file NEDepthwiseConvolutionLayerNativeKernel.cpp.

References ARM_COMPUTE_RETURN_ON_ERROR, and arm_compute::validate_arguments().

Referenced by NEDepthwiseConvolutionLayerNativeKernel::name().

 {
     ARM_COMPUTE_RETURN_ON_ERROR(validate_arguments(input, weights, biases, output, conv_info, depth_multiplier, dilation));
     return Status{};
 }
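
The data type rules listed in the parameter table can be expressed as a small predicate. The sketch below is a self-contained illustration, not the library's validate_arguments(), which is not shown on this page and also checks shapes, layout and other properties. ElemType and types_are_compatible are hypothetical names, and the weights rule follows a strict reading (same as input, or QSYMM8_PER_CHANNEL for a quantized input) that matches the dispatch switch in configure().

```cpp
#include <cassert>

// Hypothetical stand-in for the library's data type enumeration.
enum class ElemType { QASYMM8, QASYMM8_SIGNED, QSYMM8_PER_CHANNEL, F16, F32, S32 };

bool is_quantized_input(ElemType t)
{
    return t == ElemType::QASYMM8 || t == ElemType::QASYMM8_SIGNED;
}

bool is_supported_input(ElemType t)
{
    return is_quantized_input(t) || t == ElemType::F16 || t == ElemType::F32;
}

// biases may be nullptr, mirroring "Must be nullptr if not needed".
bool types_are_compatible(ElemType input, ElemType weights, const ElemType *biases, ElemType output)
{
    if(!is_supported_input(input))
    {
        return false;
    }
    // Weights: same as input, or per-channel symmetric for quantized inputs.
    const bool weights_ok = (weights == input)
                            || (is_quantized_input(input) && weights == ElemType::QSYMM8_PER_CHANNEL);
    // Biases: S32 for quantized inputs, otherwise same as input.
    const ElemType expected_bias = is_quantized_input(input) ? ElemType::S32 : input;
    const bool     biases_ok     = (biases == nullptr) || (*biases == expected_bias);
    // Output: same as input.
    return weights_ok && biases_ok && (output == input);
}
```

Because validate() takes ITensorInfo descriptors rather than tensors, checks of this kind can run before any tensor memory is allocated.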

The documentation for this class was generated from the following files:

src/core/NEON/kernels/NEDepthwiseConvolutionLayerNativeKernel.h
src/core/NEON/kernels/NEDepthwiseConvolutionLayerNativeKernel.cpp
