Neon kernel used to compute the row-vectors of sums of all the entries in each column of Matrix B. More...

#include <NEGEMMLowpReductionKernel.h>


Public Member Functions
const char *	name () const override
	Name of the kernel. More...

	NEGEMMLowpMatrixBReductionKernel ()=default
	Default constructor. More...

	NEGEMMLowpMatrixBReductionKernel (const NEGEMMLowpMatrixBReductionKernel &)=delete
	Prevent instances of this class from being copied (As this class contains pointers) More...

NEGEMMLowpMatrixBReductionKernel &	operator= (const NEGEMMLowpMatrixBReductionKernel &)=delete
	Prevent instances of this class from being copied (As this class contains pointers) More...

	NEGEMMLowpMatrixBReductionKernel (NEGEMMLowpMatrixBReductionKernel &&)=default
	Allow instances of this class to be moved. More...

NEGEMMLowpMatrixBReductionKernel &	operator= (NEGEMMLowpMatrixBReductionKernel &&)=default
	Allow instances of this class to be moved. More...

	~NEGEMMLowpMatrixBReductionKernel ()=default
	Default destructor. More...

void	configure (const ITensor *mtx_b, ITensor *vector_sum_col, const GEMMLowpReductionKernelInfo &info) override
	Initialise the kernel's input and output. More...

void	run (const Window &window, const ThreadInfo &info) override
	Execute the kernel on the passed window. More...

Public Member Functions inherited from INEGEMMLowpReductionKernel
	INEGEMMLowpReductionKernel ()
	Constructor. More...

	INEGEMMLowpReductionKernel (const INEGEMMLowpReductionKernel &)=delete
	Prevent instances of this class from being copied (As this class contains pointers) More...

INEGEMMLowpReductionKernel &	operator= (const INEGEMMLowpReductionKernel &)=delete
	Prevent instances of this class from being copied (As this class contains pointers) More...

	INEGEMMLowpReductionKernel (INEGEMMLowpReductionKernel &&)=default
	Allow instances of this class to be moved. More...

INEGEMMLowpReductionKernel &	operator= (INEGEMMLowpReductionKernel &&)=default
	Allow instances of this class to be moved. More...

virtual	~INEGEMMLowpReductionKernel ()=default
	Default destructor. More...

Public Member Functions inherited from ICPPKernel
virtual	~ICPPKernel ()=default
	Default destructor. More...

virtual void	run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator)
	Legacy compatibility layer for implementations which do not support thread_locator. In these cases we simply narrow the interface down to the legacy version. More...

virtual void	run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info)
	Execute the kernel on the passed window. More...

Public Member Functions inherited from IKernel
	IKernel ()
	Constructor. More...

virtual	~IKernel ()=default
	Destructor. More...

virtual bool	is_parallelisable () const
	Indicates whether or not the kernel is parallelisable. More...

virtual BorderSize	border_size () const
	The size of the border for that kernel. More...

const Window &	window () const
	The maximum window the kernel can be executed on. More...

Static Public Member Functions
static Status	validate (const ITensorInfo *mtx_b, const ITensorInfo *vector_sum_col, const GEMMLowpReductionKernelInfo &info)
	Static function to check if given info will lead to a valid configuration of NEGEMMLowpMatrixBReductionKernel. More...

Detailed Description

Neon kernel used to compute the row-vectors of sums of all the entries in each column of Matrix B.

Note: This stage is needed to handle the offsets of the matrix product; see https://github.com/google/gemmlowp/blob/master/doc/low-precision.md
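To make the role of the column sums concrete, here is a minimal scalar sketch (not the Neon implementation). It assumes a row-major K x N matrix and models `mul_by_scalar` / `scalar` as a plain post-multiplication, following the parameter descriptions on this page. The helper names `column_sums` and `offset_identity_holds` are hypothetical and exist only for illustration. The second function checks the standard expansion of an offset-corrected product, in which the column sums of B are the term multiplied by the offset of A.

```cpp
#include <cstdint>
#include <vector>

// Scalar reference (not the Neon kernel): sums each column of a row-major
// K x N matrix of 8-bit values into 32-bit accumulators.
// `column_sums` is a hypothetical helper name used only for illustration.
template <typename T>
std::vector<int32_t> column_sums(const std::vector<T> &b, int k, int n,
                                 bool mul_by_scalar = false, int32_t scalar = 1)
{
    std::vector<int32_t> sums(n, 0);
    for(int row = 0; row < k; ++row)
    {
        for(int col = 0; col < n; ++col)
        {
            sums[col] += static_cast<int32_t>(b[row * n + col]);
        }
    }
    // Assumption: the scalar is applied once to every reduced value.
    if(mul_by_scalar)
    {
        for(auto &s : sums)
        {
            s *= scalar;
        }
    }
    return sums;
}

// Checks, for one row of A against every column j of B, that
//   sum_k (a[k] - a_off) * (b[k][j] - b_off)
//     == sum_k a[k] * b[k][j] - a_off * colsum_b[j] - b_off * rowsum_a + K * a_off * b_off
bool offset_identity_holds(const std::vector<uint8_t> &a_row, const std::vector<uint8_t> &b,
                           int k, int n, int32_t a_off, int32_t b_off)
{
    const std::vector<int32_t> col_sum_b = column_sums(b, k, n);

    int32_t row_sum_a = 0;
    for(int i = 0; i < k; ++i)
    {
        row_sum_a += a_row[i];
    }

    for(int j = 0; j < n; ++j)
    {
        int32_t direct = 0; // product of offset-corrected values
        int32_t raw    = 0; // product of raw quantized values
        for(int i = 0; i < k; ++i)
        {
            direct += (a_row[i] - a_off) * (b[i * n + j] - b_off);
            raw += a_row[i] * b[i * n + j];
        }
        const int32_t expanded = raw - a_off * col_sum_b[j] - b_off * row_sum_a + k * a_off * b_off;
        if(direct != expanded)
        {
            return false;
        }
    }
    return true;
}
```

Because the column sums depend only on B, they can be computed once and reused for every row of A, which is presumably why the reduction is a separate stage.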

Definition at line 138 of file NEGEMMLowpReductionKernel.h.

Constructor & Destructor Documentation

◆ NEGEMMLowpMatrixBReductionKernel() [1/3]

NEGEMMLowpMatrixBReductionKernel ( )

default

Default constructor.

◆ NEGEMMLowpMatrixBReductionKernel() [2/3]

NEGEMMLowpMatrixBReductionKernel ( const NEGEMMLowpMatrixBReductionKernel & )

delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ NEGEMMLowpMatrixBReductionKernel() [3/3]

NEGEMMLowpMatrixBReductionKernel ( NEGEMMLowpMatrixBReductionKernel && )

default

Allow instances of this class to be moved.

◆ ~NEGEMMLowpMatrixBReductionKernel()

~NEGEMMLowpMatrixBReductionKernel ( )

default

Default destructor.

Member Function Documentation

◆ configure()

void configure	(	const ITensor *	mtx_b,
		ITensor *	vector_sum_col,
		const GEMMLowpReductionKernelInfo &	info
	)

overridevirtual

Initialise the kernel's input and output.

Parameters

[in]	mtx_b	Input tensor. Data type supported: QASYMM8/QASYMM8_SIGNED/QSYMM8/QSYMM8_PER_CHANNEL
[out]	vector_sum_col	Output row-vector of sums of all the entries in each column of mtx_b. Data type supported: S32
[in]	info	Kernel metadata:
		k (num_mtx_b_rows): Number of matrix B rows.
		is_reshaped (is_transposed1xW): True if the input tensor is transposed 1xW.
		scalar: Scalar value to multiply each reduced row by.
		mul_by_scalar: True if each reduced row must be multiplied by a scalar value.

Implements INEGEMMLowpReductionKernel.

Definition at line 186 of file NEGEMMLowpReductionKernel.cpp.

References ARM_COMPUTE_ERROR_ON_MSG, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, arm_compute::auto_init_if_empty(), arm_compute::calculate_max_window_horizontal(), ITensorInfo::dimension(), ITensor::info(), GEMMLowpReductionKernelInfo::is_reshaped, GEMMLowpReductionKernelInfo::k, GEMMLowpReductionKernelInfo::mul_by_scalar, num_elems_processed_per_iteration, arm_compute::S32, GEMMLowpReductionKernelInfo::scalar, ITensorInfo::set_valid_region(), and ITensorInfo::tensor_shape().

 {
     ARM_COMPUTE_ERROR_ON_NULLPTR(mtx_b, vector_sum_col);
     ARM_COMPUTE_ERROR_ON_MSG(info.is_reshaped == true, "Not supported");
 
     ARM_COMPUTE_ERROR_THROW_ON(validate_arguments_matrix_b_reduction(mtx_b->info(), vector_sum_col->info()));
 
     _input         = mtx_b;
     _output        = vector_sum_col;
     _k             = info.k;
     _scalar        = info.scalar;
     _mul_by_scalar = info.mul_by_scalar;
 
     // Number of output elements processed per iteration
     constexpr unsigned int num_elems_processed_per_iteration = 16;
 
     // Output auto initialization if not yet initialized
     auto_init_if_empty(*_output->info(), TensorShape(_input->info()->dimension(0)), 1, DataType::S32);
 
     // Configure kernel window
     Window win = calculate_max_window_horizontal(*_output->info(), Steps(num_elems_processed_per_iteration));
     _output->info()->set_valid_region(ValidRegion(Coordinates(), _output->info()->tensor_shape()));
     INEKernel::configure(win);
 }

◆ name()

const char* name ( ) const

inlineoverridevirtual

Name of the kernel.

Returns: Kernel name

Implements ICPPKernel.

Definition at line 141 of file NEGEMMLowpReductionKernel.h.

References INEGEMMLowpReductionKernel::configure(), arm_compute::test::validation::info, INEGEMMLowpReductionKernel::operator=(), ICPPKernel::run(), arm_compute::validate(), and IKernel::window().

     {
         return "NEGEMMLowpMatrixBReductionKernel";
     }

◆ operator=() [1/2]

NEGEMMLowpMatrixBReductionKernel& operator= ( const NEGEMMLowpMatrixBReductionKernel & )

delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

NEGEMMLowpMatrixBReductionKernel& operator= ( NEGEMMLowpMatrixBReductionKernel && )

default

Allow instances of this class to be moved.

◆ run()

void run	(	const Window &	window,
		const ThreadInfo &	info
	)

overridevirtual

Execute the kernel on the passed window.

Warning: If is_parallelisable() returns false then the passed window must be equal to window()

Note: The window has to be a region within the window returned by the window() method; The width of the window has to be a multiple of num_elems_processed_per_iteration().

Parameters

[in]	window	Region on which to execute the kernel. (Must be a region of the window returned by window())
[in]	info	Info about executing thread and CPU.

Reimplemented from ICPPKernel.

Definition at line 365 of file NEGEMMLowpReductionKernel.cpp.

References ARM_COMPUTE_ERROR, ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, ITensorInfo::data_type(), ITensor::info(), arm_compute::test::validation::info, arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, arm_compute::QSYMM8, arm_compute::QSYMM8_PER_CHANNEL, and IKernel::window().

 {
     ARM_COMPUTE_UNUSED(info);
     ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);
     ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW(INEKernel::window(), window);
 
     switch(_input->info()->data_type())
     {
         case DataType::QASYMM8:
             run_internal<uint8_t>(window, info);
             break;
         case DataType::QASYMM8_SIGNED:
         case DataType::QSYMM8:
         case DataType::QSYMM8_PER_CHANNEL:
             run_internal<int8_t>(window, info);
             break;
         default:
             ARM_COMPUTE_ERROR("Unsupported data type");
     }
 }
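The dispatch in run() maps four quantized data types onto two element types: QASYMM8 is read as unsigned 8-bit, the other three as signed 8-bit. A minimal, self-contained sketch of that pattern follows. The `DataType` enum here is a hypothetical stand-in for the library's enum, and `run_internal_sketch` and `dispatch` are illustrative names rather than library functions; the sketch throws an exception where the kernel raises ARM_COMPUTE_ERROR.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <type_traits>

// Hypothetical stand-in for the library's DataType enum. F32 is included
// only to exercise the unsupported path.
enum class DataType
{
    QASYMM8,
    QASYMM8_SIGNED,
    QSYMM8,
    QSYMM8_PER_CHANNEL,
    F32
};

// Placeholder for the templated worker: reports which element type was chosen.
template <typename T>
std::string run_internal_sketch()
{
    return std::is_signed<T>::value ? "int8_t" : "uint8_t";
}

// Mirrors the switch in run(): one unsigned case, three signed cases,
// and an error for anything else.
std::string dispatch(DataType dt)
{
    switch(dt)
    {
        case DataType::QASYMM8:
            return run_internal_sketch<uint8_t>();
        case DataType::QASYMM8_SIGNED:
        case DataType::QSYMM8:
        case DataType::QSYMM8_PER_CHANNEL:
            return run_internal_sketch<int8_t>();
        default:
            throw std::invalid_argument("Unsupported data type");
    }
}
```

Selecting the element type once, outside the inner loop, keeps the per-element code free of run-time type checks.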

◆ validate()

Status validate	(	const ITensorInfo *	mtx_b,
		const ITensorInfo *	vector_sum_col,
		const GEMMLowpReductionKernelInfo &	info
	)

static

Static function to check if given info will lead to a valid configuration of NEGEMMLowpMatrixBReductionKernel.

Parameters

[in]	mtx_b	Input tensor. Data type supported: QASYMM8/QASYMM8_SIGNED/QSYMM8/QSYMM8_PER_CHANNEL
[in]	vector_sum_col	Output row-vector of sums of all the entries in each column of mtx_b. Data type supported: S32
[in]	info	Kernel metadata:
		k (num_mtx_b_rows): Number of matrix B rows.
		is_reshaped (is_transposed1xW): True if the input tensor is transposed 1xW.
		scalar: Scalar value to multiply each reduced row by.
		mul_by_scalar: True if each reduced row must be multiplied by a scalar value.

Returns: a status

Definition at line 211 of file NEGEMMLowpReductionKernel.cpp.

Referenced by NEGEMMLowpMatrixMultiplyCore::validate().

 {
     ARM_COMPUTE_UNUSED(info);
     ARM_COMPUTE_RETURN_ON_ERROR(validate_arguments_matrix_b_reduction(mtx_b, vector_sum_col));
 
     return Status{};
 }

The documentation for this class was generated from the following files:

src/core/NEON/kernels/NEGEMMLowpReductionKernel.h
src/core/NEON/kernels/NEGEMMLowpReductionKernel.cpp
