Compute Library 21.02
NEGEMMLowpMatrixMultiplyCore Class Reference

Basic function to execute GEMMLowpMatrixMultiplyCore on Neon. More...

#include <NEGEMMLowpMatrixMultiplyCore.h>

Collaboration diagram for NEGEMMLowpMatrixMultiplyCore:

Public Member Functions

 NEGEMMLowpMatrixMultiplyCore (std::shared_ptr< IMemoryManager > memory_manager=nullptr, IWeightsManager *weights_manager=nullptr)
 Constructor. More...
 
 NEGEMMLowpMatrixMultiplyCore (const NEGEMMLowpMatrixMultiplyCore &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 NEGEMMLowpMatrixMultiplyCore (NEGEMMLowpMatrixMultiplyCore &&)=default
 Default move constructor. More...
 
NEGEMMLowpMatrixMultiplyCore & operator= (const NEGEMMLowpMatrixMultiplyCore &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
NEGEMMLowpMatrixMultiplyCore & operator= (NEGEMMLowpMatrixMultiplyCore &&)=default
 Default move assignment operator. More...
 
 ~NEGEMMLowpMatrixMultiplyCore ()
 Default destructor. More...
 
void configure (const ITensor *a, const ITensor *b, const ITensor *c, ITensor *output, const GEMMInfo &gemm_info=GEMMInfo())
 Initialise the kernel's inputs, output. More...
 
void run () override
 Run the kernels contained in the function. More...
 
void prepare () override
 Prepare the function for executing. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *a, const ITensorInfo *b, const ITensorInfo *c, const ITensorInfo *output, const GEMMInfo &gemm_info=GEMMInfo())
 Static function to check if given info will lead to a valid configuration of NEGEMMLowpMatrixMultiplyCore. More...
 

Detailed Description

Basic function to execute GEMMLowpMatrixMultiplyCore on Neon.

This function calls the following Neon kernels if the DOT product instruction is not available:

  1. NEGEMMInterleave4x4Kernel
  2. NEGEMMTranspose1xWKernel
  3. NEGEMMLowpMatrixMultiplyKernel
  4. NEGEMMLowpOffsetContributionKernel
  5. NEActivationLayer

otherwise if the DOT product instruction is available:

  1. NEGEMMLowpOffsetContributionKernel

Definition at line 63 of file NEGEMMLowpMatrixMultiplyCore.h.
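
A minimal usage sketch, assuming QASYMM8 inputs and an S32 output with the default GEMMInfo (no output stage); the shapes, quantization parameters and main() scaffolding below are illustrative, not taken from this page:

#include "arm_compute/core/TensorInfo.h"
#include "arm_compute/runtime/NEON/functions/NEGEMMLowpMatrixMultiplyCore.h"
#include "arm_compute/runtime/Tensor.h"

using namespace arm_compute;

int main()
{
    // Illustrative shapes: A is 16x32, B is 32x8, so the output is 16x8
    Tensor a, b, dst;
    a.allocator()->init(TensorInfo(TensorShape(32U, 16U), 1, DataType::QASYMM8, QuantizationInfo(0.5f, 10)));
    b.allocator()->init(TensorInfo(TensorShape(8U, 32U), 1, DataType::QASYMM8, QuantizationInfo(0.25f, 3)));
    dst.allocator()->init(TensorInfo(TensorShape(8U, 16U), 1, DataType::S32));

    // The default GEMMInfo means GEMMLowpOutputStageType::NONE, so the result stays S32
    NEGEMMLowpMatrixMultiplyCore gemmlowp;
    gemmlowp.configure(&a, &b, nullptr, &dst);

    a.allocator()->allocate();
    b.allocator()->allocate();
    dst.allocator()->allocate();

    // ... fill a and b with quantized data here ...

    gemmlowp.run();
    return 0;
}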

Constructor & Destructor Documentation

◆ NEGEMMLowpMatrixMultiplyCore() [1/3]

NEGEMMLowpMatrixMultiplyCore ( std::shared_ptr< IMemoryManager > memory_manager = nullptr,
IWeightsManager * weights_manager = nullptr 
)

Constructor.

Definition at line 68 of file NEGEMMLowpMatrixMultiplyCore.cpp.

69  : _memory_group(memory_manager), _weights_manager(weights_manager), _asm_glue(std::make_unique<NEGEMMAssemblyDispatch>(memory_manager, weights_manager)), _mm_kernel(), _mtx_a_reshape_kernel(),
70  _mtx_b_reshape_kernel(), _mtx_a_reduction_kernel(), _mtx_b_reduction_kernel(), _offset_contribution_kernel(), _offset_contribution_output_stage_kernel(), _activation_func(),
71  _convert_to_signed_asymm(), _convert_from_signed_asymm(), _vector_sum_col(), _vector_sum_row(), _tmp_a(), _tmp_b(), _mm_result_s32(), _signed_a(), _signed_output(), _original_b(nullptr), _a_offset(0),
72  _b_offset(0), _run_vector_matrix_multiplication(false), _assembly_path(false), _fused_assembly_path(false), _reshape_b_only_on_first_run(false), _is_prepared(false), _fuse_output_stage(false),
73  _run_activation(false), _flip_signedness(false)
74 {
75 }
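
A sketch of constructing the function with a shared memory manager so its internal tensors can be pooled with other functions; the choice of BlobLifetimeManager/PoolManager here is an assumption for illustration, not mandated by this class:

#include "arm_compute/runtime/BlobLifetimeManager.h"
#include "arm_compute/runtime/MemoryManagerOnDemand.h"
#include "arm_compute/runtime/PoolManager.h"
#include "arm_compute/runtime/NEON/functions/NEGEMMLowpMatrixMultiplyCore.h"

#include <memory>

void build_with_memory_manager()
{
    // Pool the function's internal tensors through an on-demand memory manager
    auto lifetime_mgr = std::make_shared<arm_compute::BlobLifetimeManager>();
    auto pool_mgr     = std::make_shared<arm_compute::PoolManager>();
    auto memory_mgr   = std::make_shared<arm_compute::MemoryManagerOnDemand>(lifetime_mgr, pool_mgr);

    arm_compute::NEGEMMLowpMatrixMultiplyCore gemmlowp(memory_mgr);
    // ... configure/allocate as usual, populate memory_mgr with pools,
    //     then run(); memory_mgr can be shared with other functions ...
}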

◆ NEGEMMLowpMatrixMultiplyCore() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ NEGEMMLowpMatrixMultiplyCore() [3/3]

Default move constructor.

◆ ~NEGEMMLowpMatrixMultiplyCore()

Default destructor.

Member Function Documentation

◆ configure()

void configure ( const ITensor * a,
const ITensor * b,
const ITensor * c,
ITensor * output,
const GEMMInfo & gemm_info = GEMMInfo() 
)

Initialise the kernel's inputs, output.

Note
GEMM_LOWP: low precision GEMM kernel. This kernel performs the following computations:
  1. Convert a values from QASYMM8 to int32 and add a_offset to each of them.
  2. Convert b values from QASYMM8 to int32 and add b_offset to each of them.
  3. Compute the matrix product of the resulting a * b in int32.
Note
The output type is S32 if gemm_info.gemmlowp_output_stage().type == GEMMLowpOutputStageType::NONE. It is QASYMM8/QASYMM8_SIGNED otherwise.
Parameters
[in]  a          First input tensor (Matrix A). Data type supported: QASYMM8/QASYMM8_SIGNED.
[in]  b          Second input tensor (Matrix B). Data type supported: QASYMM8/QASYMM8_SIGNED/QSYMM8/QSYMM8_PER_CHANNEL.
[in]  c          Third input tensor (Matrix C). It can be a nullptr. Data type supported: S32
[out] output     Output tensor. Data type supported: S32/QASYMM8/QASYMM8_SIGNED
[in]  gemm_info  (Optional) Specifies if the matrix A and/or matrix B have been reshaped and if the reshape of matrix B should be executed only for the first run
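
The three steps in the note above amount to a plain int32 GEMM over offset-corrected inputs; a naive scalar reference (a hypothetical helper, not part of the library) might look like this:

#include <cstdint>
#include <vector>

// Naive int32 reference for the three steps in the note above: widen the
// QASYMM8 inputs to int32, add the respective offsets, then accumulate the
// matrix product in int32.
std::vector<int32_t> gemmlowp_reference(const std::vector<uint8_t> &a, // M x K, row-major
                                        const std::vector<uint8_t> &b, // K x N, row-major
                                        int M, int N, int K,
                                        int32_t a_offset, int32_t b_offset)
{
    std::vector<int32_t> dst(static_cast<size_t>(M) * N, 0);
    for(int m = 0; m < M; ++m)
    {
        for(int n = 0; n < N; ++n)
        {
            int32_t acc = 0;
            for(int k = 0; k < K; ++k)
            {
                const int32_t a_val = static_cast<int32_t>(a[m * K + k]) + a_offset;
                const int32_t b_val = static_cast<int32_t>(b[k * N + n]) + b_offset;
                acc += a_val * b_val;
            }
            dst[m * N + n] = acc;
        }
    }
    return dst;
}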

Definition at line 77 of file NEGEMMLowpMatrixMultiplyCore.cpp.

References GEMMInfo::activation_info(), TensorAllocator::allocate(), Tensor::allocator(), ARM_COMPUTE_ERROR, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, ARM_COMPUTE_UNUSED, arm_compute::test::validation::b, ICloneable< T >::clone(), arm_compute::misc::shape_calculator::compute_interleaved_shape(), arm_compute::misc::shape_calculator::compute_reductionA_shape(), arm_compute::misc::shape_calculator::compute_reductionB_shape(), arm_compute::misc::shape_calculator::compute_transpose1xW_shape(), NEActivationLayer::configure(), ITensorInfo::data_type(), ITensorInfo::dimension(), dt, ActivationLayerInfo::enabled(), GEMMLowpOutputStageInfo::gemmlowp_max_bound, GEMMLowpOutputStageInfo::gemmlowp_min_bound, GEMMLowpOutputStageInfo::gemmlowp_offset, GEMMInfo::gemmlowp_output_stage(), ITensor::info(), Tensor::info(), TensorAllocator::init(), NEGEMMAssemblyDispatch::is_activation_supported(), arm_compute::is_data_type_quantized_asymmetric(), arm_compute::is_data_type_quantized_per_channel(), MemoryGroup::manage(), arm_compute::NONE, UniformQuantizationInfo::offset, arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, ITensorInfo::quantization_info(), arm_compute::QUANTIZE_DOWN_FIXEDPOINT, GEMMInfo::reshape_b_only_on_first_run(), arm_compute::S32, arm_compute::S8, UniformQuantizationInfo::scale, GEMMInfo::set_gemmlowp_output_stage(), ITensorInfo::tensor_shape(), GEMMLowpOutputStageInfo::type, arm_compute::U8, QuantizationInfo::uniform(), and NEGEMMLowpMatrixMultiplyCore::validate().

Referenced by NELSTMLayerQuantized::configure(), main(), and NEQLSTMLayer::NEQLSTMLayer().

78 {
79  ARM_COMPUTE_ERROR_ON_NULLPTR(a, b, output);
80  ARM_COMPUTE_UNUSED(c);
81  ARM_COMPUTE_ERROR_THROW_ON(NEGEMMLowpMatrixMultiplyCore::validate(a->info(), b->info(), c != nullptr ? c->info() : nullptr, output->info(), gemm_info));
82 
83  const ITensor *matrix_a = a;
84  const ITensor *matrix_b = b;
85  GEMMInfo info = gemm_info;
86 
87  // Set internal variables
88  _a_offset = a->info()->quantization_info().uniform().offset;
89  _b_offset = b->info()->quantization_info().uniform().offset;
90  _run_vector_matrix_multiplication = a->info()->dimension(1) < 2;
91  _reshape_b_only_on_first_run = info.reshape_b_only_on_first_run();
92  _is_prepared = false;
93  _fused_assembly_path = false;
94  _flip_signedness = is_data_type_quantized_per_channel(b->info()->data_type()) && (a->info()->data_type() == DataType::QASYMM8) && _reshape_b_only_on_first_run;
95  _original_b = b;
96 
97  const ITensor *a_to_use = a;
98 
99  // Convert to QASYMM8 -> QASYMM8_SIGNED and back
100  if(_flip_signedness)
101  {
102  const int32_t offset_correction = 128;
103  const DataType dt = DataType::QASYMM8_SIGNED;
104  const UniformQuantizationInfo iqinfo = a_to_use->info()->quantization_info().uniform();
105 
106  _signed_a.allocator()->init(a_to_use->info()->clone()->set_data_type(dt).set_quantization_info(QuantizationInfo(iqinfo.scale, iqinfo.offset + offset_correction)));
107  _memory_group.manage(&_signed_a);
108  _convert_to_signed_asymm = std::make_unique<NEConvertQuantizedSignednessKernel>();
109  _convert_to_signed_asymm->configure(a_to_use, &_signed_a);
110  a_to_use = &_signed_a;
111  _a_offset = _signed_a.info()->quantization_info().uniform().offset;
112 
113  const UniformQuantizationInfo oqinfo = output->info()->quantization_info().uniform();
114  _memory_group.manage(&_signed_output);
115  _signed_output.allocator()->init(output->info()->clone()->set_data_type(dt).set_quantization_info(QuantizationInfo(oqinfo.scale, oqinfo.offset - offset_correction)));
116 
117  // Output stage correction
118  GEMMLowpOutputStageInfo output_stage_corr = info.gemmlowp_output_stage();
119  output_stage_corr.gemmlowp_offset = _signed_output.info()->quantization_info().uniform().offset;
120  output_stage_corr.gemmlowp_min_bound -= offset_correction;
121  output_stage_corr.gemmlowp_max_bound -= offset_correction;
122  info.set_gemmlowp_output_stage(output_stage_corr);
123 
124  // Update matrix a
125  matrix_a = &_signed_a;
126  }
127 
128  // If GEMMLowpOutputStage != NONE, fuse the offset contribution with the output stage
129  if(info.gemmlowp_output_stage().type != GEMMLowpOutputStageType::NONE)
130  {
131  _fuse_output_stage = true;
132  _memory_group.manage(&_mm_result_s32);
133  TensorInfo info_mm_result_s32(output->info()->tensor_shape(), 1, DataType::S32);
134  _mm_result_s32.allocator()->init(info_mm_result_s32);
135  }
136 
137  // Initialize assembly kernel meta-data
138  const AsmGemmInfo asm_info = init_assembly_metadata(gemm_info);
139 #ifdef __aarch64__
140  switch(a->info()->data_type())
141  {
142  case DataType::QASYMM8:
143  case DataType::QASYMM8_SIGNED:
144  case DataType::U8:
145  case DataType::S8:
146  {
147  if(is_data_type_quantized_asymmetric(a_to_use->info()->data_type()) && info.gemmlowp_output_stage().type == GEMMLowpOutputStageType::QUANTIZE_DOWN_FIXEDPOINT)
148  {
149  _asm_glue->configure(a_to_use, b, c, output, asm_info);
150  _fused_assembly_path = _asm_glue->is_configured();
151  }
152  else
153  {
154  _asm_glue->configure(a_to_use, b, nullptr, _fuse_output_stage ? &_mm_result_s32 : output, asm_info);
155  }
156  _assembly_path = _asm_glue->is_configured();
157  break;
158  }
159  default:
160  {
161  ARM_COMPUTE_ERROR("Datatype not supported");
162  break;
163  }
164  }
165 #endif /* __aarch64__ */
166  if(!(_assembly_path || _run_vector_matrix_multiplication))
167  {
168  matrix_a = &_tmp_a;
169  matrix_b = &_tmp_b;
170 
171  // The interleaved output matrix will have the following shape: [ a_height * 4, ceil(a_width / 4.0f) ]
172  TensorInfo a_info(compute_interleaved_shape(*a_to_use->info()), 1, a_to_use->info()->data_type(), a_to_use->info()->quantization_info());
173  // The transpose1xW output matrix will have the following shape: [ b_height * 16, ceil(b_width / 16.0f) ]
174  TensorInfo b_info(compute_transpose1xW_shape(*b->info()), 1, b->info()->data_type(), b->info()->quantization_info());
175  _tmp_a.allocator()->init(a_info);
176  _tmp_b.allocator()->init(b_info);
177  _memory_group.manage(&_tmp_a);
178  if(!_reshape_b_only_on_first_run)
179  {
180  _memory_group.manage(&_tmp_b);
181  }
182 
183  // Configure interleave kernel
184  _mtx_a_reshape_kernel = std::make_unique<NEGEMMInterleave4x4Kernel>();
185  _mtx_a_reshape_kernel->configure(a_to_use, &_tmp_a);
186 
187  // Configure transpose kernel
188  _mtx_b_reshape_kernel = std::make_unique<NEGEMMTranspose1xWKernel>();
189  _mtx_b_reshape_kernel->configure(b, &_tmp_b);
190  }
191 
192  if(!_fused_assembly_path)
193  {
194  // Build reduction info
195  const GEMMLowpReductionKernelInfo reduction_info(a_to_use->info()->dimension(0), false, 0, false);
196 
197  // Initialize matrix B reduction kernel only if _a_offset is not equal to 0
198  if(_a_offset != 0)
199  {
200  TensorInfo info_vector_sum_col(compute_reductionA_shape(*b->info()), 1, DataType::S32);
201 
202  _vector_sum_col.allocator()->init(info_vector_sum_col);
203  if(!_reshape_b_only_on_first_run)
204  {
205  _memory_group.manage(&_vector_sum_col);
206  }
207 
208  // Configure Matrix B reduction kernel
209  _mtx_b_reduction_kernel = std::make_unique<NEGEMMLowpMatrixBReductionKernel>();
210  _mtx_b_reduction_kernel->configure(b, &_vector_sum_col, reduction_info);
211  }
212 
213  // Initialize Matrix A reduction kernel only if _b_offset is not equal to 0
214  if(_b_offset != 0)
215  {
216  TensorInfo info_vector_sum_row(compute_reductionB_shape(*a_to_use->info()), 1, DataType::S32);
217 
218  _vector_sum_row.allocator()->init(info_vector_sum_row);
219  _memory_group.manage(&_vector_sum_row);
220 
221  // Configure matrix A reduction kernel
222  _mtx_a_reduction_kernel = std::make_unique<NEGEMMLowpMatrixAReductionKernel>();
223  _mtx_a_reduction_kernel->configure(a_to_use, &_vector_sum_row, reduction_info);
224  }
225 
226  if(_fuse_output_stage)
227  {
228  // Configure matrix multiply kernel
229  if(!_assembly_path)
230  {
231  _mm_kernel = std::make_unique<NEGEMMLowpMatrixMultiplyKernel>();
232  _mm_kernel->configure(matrix_a, matrix_b, &_mm_result_s32);
233  }
234 
235  _offset_contribution_output_stage_kernel = std::make_unique<NEGEMMLowpOffsetContributionOutputStageKernel>();
236  _offset_contribution_output_stage_kernel->configure(&_mm_result_s32,
237  _a_offset == 0 ? nullptr : &_vector_sum_col,
238  _b_offset == 0 ? nullptr : &_vector_sum_row, c,
239  _flip_signedness ? &_signed_output : output,
240  a->info()->dimension(0),
241  _a_offset, _b_offset, info.gemmlowp_output_stage());
242 
243  if(_flip_signedness)
244  {
245  _convert_from_signed_asymm = std::make_unique<NEConvertQuantizedSignednessKernel>();
246  _convert_from_signed_asymm->configure(&_signed_output, output);
247  }
248  }
249  else
250  {
251  // Configure matrix multiply kernel
252  if(!_assembly_path)
253  {
254  _mm_kernel = std::make_unique<NEGEMMLowpMatrixMultiplyKernel>();
255  _mm_kernel->configure(matrix_a, matrix_b, output);
256  }
257  // Configure offset contribution kernel
258  _offset_contribution_kernel = std::make_unique<NEGEMMLowpOffsetContributionKernel>();
259  _offset_contribution_kernel->configure(output, _a_offset == 0 ? nullptr : &_vector_sum_col, _b_offset == 0 ? nullptr : &_vector_sum_row, a_to_use->info()->dimension(0), _a_offset, _b_offset);
260  }
261  }
262  // Configure activation
263  const ActivationLayerInfo &activation = gemm_info.activation_info();
264  _run_activation = activation.enabled() && (!_assembly_path || !NEGEMMAssemblyDispatch::is_activation_supported(activation));
265  if(_run_activation)
266  {
267  _activation_func.configure(output, nullptr, activation);
268  }
269 
270  // Allocate tensors
271  if(!_assembly_path && !_run_vector_matrix_multiplication)
272  {
273  _tmp_a.allocator()->allocate();
274  if(!_reshape_b_only_on_first_run)
275  {
276  _tmp_b.allocator()->allocate();
277  }
278  }
279 
280  if(!_fused_assembly_path)
281  {
282  if(_a_offset != 0 && !_reshape_b_only_on_first_run)
283  {
284  _vector_sum_col.allocator()->allocate();
285  }
286 
287  if(_b_offset != 0)
288  {
289  _vector_sum_row.allocator()->allocate();
290  }
291  }
292 
293  if(_fuse_output_stage)
294  {
295  _mm_result_s32.allocator()->allocate();
296  }
297 
298  if(_flip_signedness)
299  {
300  _signed_a.allocator()->allocate();
301  _signed_output.allocator()->allocate();
302  }
303 }

◆ operator=() [1/2]

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

Default move assignment operator.

◆ prepare()

void prepare ( )
override virtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note
Prepare stage might not need all the function's buffers' backing memory to be available in order to execute
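
A sketch of paying the one-off cost up front (illustrative helper; run() would otherwise trigger the same preparation on its first call):

#include "arm_compute/runtime/NEON/functions/NEGEMMLowpMatrixMultiplyCore.h"

// Hypothetical helper: the function is assumed to be already configured and
// its tensors allocated.
void run_many(arm_compute::NEGEMMLowpMatrixMultiplyCore &gemmlowp, int iterations)
{
    // One-off pre-processing, e.g. the matrix B reshape/reduction when
    // reshape_b_only_on_first_run is set, happens here instead of inside the first run()
    gemmlowp.prepare();

    for(int i = 0; i < iterations; ++i)
    {
        // ... refresh the input tensors' contents here ...
        gemmlowp.run();
    }
}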

Reimplemented from IFunction.

Definition at line 573 of file NEGEMMLowpMatrixMultiplyCore.cpp.

References TensorAllocator::allocate(), Tensor::allocator(), IWeightsManager::are_weights_managed(), ARM_COMPUTE_ERROR_ON, Window::DimX, Window::DimY, Scheduler::get(), ITensor::is_used(), ITensor::mark_as_unused(), and IScheduler::schedule().

Referenced by NEGEMMConvolutionLayer::prepare(), and NEGEMMLowpMatrixMultiplyCore::run().

574 {
575  if(!_is_prepared)
576  {
577  const bool original_b_managed_by_weights_manager = _weights_manager && _weights_manager->are_weights_managed(_original_b);
578  // Run assembly reshape
579  if(_asm_glue->is_configured())
580  {
581  if(!original_b_managed_by_weights_manager)
582  {
583  ARM_COMPUTE_ERROR_ON(!_original_b->is_used());
584  }
585 
586  _asm_glue->prepare();
587  if(!original_b_managed_by_weights_manager)
588  {
589  _original_b->mark_as_unused();
590  }
591  }
592  // Run non-assembly reshape
593  else if(_reshape_b_only_on_first_run && !_run_vector_matrix_multiplication && !_asm_glue->is_configured())
594  {
595  if(!original_b_managed_by_weights_manager)
596  {
597  ARM_COMPUTE_ERROR_ON(!_original_b->is_used());
598  }
599 
600  // Run reshape kernel and mark original weights tensor as unused
601  _tmp_b.allocator()->allocate();
602  NEScheduler::get().schedule(_mtx_b_reshape_kernel.get(), Window::DimY);
603  if(!original_b_managed_by_weights_manager)
604  {
605  _original_b->mark_as_unused();
606  }
607  }
608 
609  // Run matrix B reduction kernel only if _a_offset is not equal to 0
610  if(!_fused_assembly_path && _a_offset != 0 && _reshape_b_only_on_first_run)
611  {
612  _vector_sum_col.allocator()->allocate();
613  NEScheduler::get().schedule(_mtx_b_reduction_kernel.get(), Window::DimX);
614  }
615 
616  _is_prepared = true;
617  }
618 }

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For Neon kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Will call prepare() on the first run if it has not been done already.
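
For the Neon path, the thread-count note above can be exercised through the scheduler singleton (a sketch; four threads is an arbitrary choice, and Scheduler::get() is used here rather than naming CPPScheduler directly):

#include "arm_compute/runtime/Scheduler.h"

void limit_cpu_threads()
{
    // By default std::thread::hardware_concurrency() threads are used;
    // cap the kernels scheduled by run() at four worker threads instead
    arm_compute::Scheduler::get().set_num_threads(4);
}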

Implements IFunction.

Definition at line 501 of file NEGEMMLowpMatrixMultiplyCore.cpp.

References Window::DimX, Window::DimY, Scheduler::get(), NEGEMMLowpMatrixMultiplyCore::prepare(), NEActivationLayer::run(), and IScheduler::schedule().

Referenced by main(), NELSTMLayerQuantized::run(), NEFullyConnectedLayer::run(), NEQLSTMLayer::run(), and NEGEMMConvolutionLayer::run().

502 {
503  prepare();
504 
505  MemoryGroupResourceScope scope_mg(_memory_group);
506 
507  // Convert QASYMM8->QASYMM8_SIGNED
508  if(_flip_signedness)
509  {
510  NEScheduler::get().schedule(_convert_to_signed_asymm.get(), Window::DimY);
511  }
512 
513  // Run GEMM
514  if(_asm_glue->is_configured())
515  {
516  _asm_glue->run();
517  }
518  else
519  {
520  if(!_run_vector_matrix_multiplication)
521  {
522  // Run interleave kernel
523  NEScheduler::get().schedule(_mtx_a_reshape_kernel.get(), Window::DimY);
524 
525  if(!_reshape_b_only_on_first_run)
526  {
527  // Run transpose kernel
528  NEScheduler::get().schedule(_mtx_b_reshape_kernel.get(), Window::DimY);
529  }
530  }
531  NEScheduler::get().schedule(_mm_kernel.get(), Window::DimY);
532  }
533 
534  if(!_fused_assembly_path)
535  {
536  // Run matrix A reduction kernel only if _b_offset is not equal to 0
537  if(_b_offset != 0)
538  {
539  NEScheduler::get().schedule(_mtx_a_reduction_kernel.get(), Window::DimX);
540  }
541 
542  // Run matrix B reduction kernel only if _a_offset is not equal to 0
543  if(_a_offset != 0 && !_reshape_b_only_on_first_run)
544  {
545  NEScheduler::get().schedule(_mtx_b_reduction_kernel.get(), Window::DimX);
546  }
547 
548  if(_fuse_output_stage)
549  {
550  // Run offset contribution kernel
551  NEScheduler::get().schedule(_offset_contribution_output_stage_kernel.get(), Window::DimY);
552  }
553  else
554  {
555  // Run offset contribution kernel
556  NEScheduler::get().schedule(_offset_contribution_kernel.get(), Window::DimY);
557  }
558  }
559 
560  // Convert QASYMM8_SIGNED->QASYMM8
561  if(!_fused_assembly_path && _fuse_output_stage && _flip_signedness)
562  {
563  NEScheduler::get().schedule(_convert_from_signed_asymm.get(), Window::DimY);
564  }
565 
566  // Run fused activation unless already run in the fused assembly
567  if(_run_activation)
568  {
569  _activation_func.run();
570  }
571 }

◆ validate()

Status validate ( const ITensorInfo * a,
const ITensorInfo * b,
const ITensorInfo * c,
const ITensorInfo * output,
const GEMMInfo & gemm_info = GEMMInfo() 
)
static

Static function to check if given info will lead to a valid configuration of NEGEMMLowpMatrixMultiplyCore.

Note
The output type is S32 if gemm_info.gemmlowp_output_stage().type == GEMMLowpOutputStageType::NONE. It is QASYMM8/QASYMM8_SIGNED otherwise.
Parameters
[in]  a          First input tensor info (Matrix A). Data type supported: QASYMM8/QASYMM8_SIGNED.
[in]  b          Second input tensor info (Matrix B). Data type supported: QASYMM8/QASYMM8_SIGNED/QSYMM8/QSYMM8_PER_CHANNEL.
[in]  c          Third input tensor info (Matrix C). It can be a nullptr. Data type supported: S32
[in]  output     Output tensor info. Data type supported: S32/QASYMM8/QASYMM8_SIGNED
[in]  gemm_info  (Optional) Specifies if the matrix A and/or matrix B have been reshaped and if the reshape of matrix B should be executed only for the first run
Returns
a status
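
A sketch of the validate-before-configure pattern using only tensor metadata (the shapes, data types and quantization parameters below are illustrative):

#include "arm_compute/core/TensorInfo.h"
#include "arm_compute/runtime/NEON/functions/NEGEMMLowpMatrixMultiplyCore.h"

#include <iostream>

using namespace arm_compute;

bool can_run_gemmlowp()
{
    // A is 16x32 QASYMM8, B is 32x8 QASYMM8, output is 16x8 S32 (no output stage requested)
    const TensorInfo a(TensorShape(32U, 16U), 1, DataType::QASYMM8, QuantizationInfo(0.5f, 10));
    const TensorInfo b(TensorShape(8U, 32U), 1, DataType::QASYMM8, QuantizationInfo(0.25f, 3));
    const TensorInfo dst(TensorShape(8U, 16U), 1, DataType::S32);

    const Status status = NEGEMMLowpMatrixMultiplyCore::validate(&a, &b, nullptr, &dst);
    if(!bool(status))
    {
        std::cerr << status.error_description() << std::endl;
        return false;
    }
    return true;
}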

Definition at line 305 of file NEGEMMLowpMatrixMultiplyCore.cpp.

References GEMMInfo::activation_info(), ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MSG, ARM_COMPUTE_RETURN_ON_ERROR, arm_compute::auto_init_if_empty(), arm_compute::test::validation::b, ICloneable< T >::clone(), arm_compute::misc::shape_calculator::compute_reductionA_shape(), arm_compute::misc::shape_calculator::compute_reductionB_shape(), ITensorInfo::data_type(), ITensorInfo::dimension(), dt, ActivationLayerInfo::enabled(), GEMMLowpOutputStageInfo::gemmlowp_max_bound, GEMMLowpOutputStageInfo::gemmlowp_min_bound, GEMMLowpOutputStageInfo::gemmlowp_offset, GEMMInfo::gemmlowp_output_stage(), GEMMInfo::is_a_reshaped(), GEMMInfo::is_b_reshaped(), arm_compute::is_data_type_quantized_asymmetric(), arm_compute::is_data_type_quantized_per_channel(), arm_compute::NONE, UniformQuantizationInfo::offset, arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, arm_compute::QSYMM8, arm_compute::QSYMM8_PER_CHANNEL, ITensorInfo::quantization_info(), arm_compute::QUANTIZE_DOWN_FIXEDPOINT, arm_compute::S32, UniformQuantizationInfo::scale, TensorShape::set(), ITensorInfo::tensor_shape(), GEMMLowpOutputStageInfo::type, QuantizationInfo::uniform(), NEConvertQuantizedSignednessKernel::validate(), NEGEMMLowpMatrixMultiplyKernel::validate(), NEActivationLayer::validate(), NEGEMMInterleave4x4Kernel::validate(), NEGEMMLowpOffsetContributionKernel::validate(), NEGEMMTranspose1xWKernel::validate(), NEGEMMAssemblyDispatch::validate(), NEGEMMLowpOffsetContributionOutputStageKernel::validate(), NEGEMMLowpMatrixAReductionKernel::validate(), and NEGEMMLowpMatrixBReductionKernel::validate().

Referenced by NEGEMMLowpMatrixMultiplyCore::configure(), arm_compute::test::validation::DATA_TEST_CASE(), and NELSTMLayerQuantized::validate().

306 {
310  ARM_COMPUTE_RETURN_ERROR_ON_MSG(c != nullptr && gemm_info.gemmlowp_output_stage().type == GEMMLowpOutputStageType::NONE, "Bias addition not supported in NEGEMMLowpMatrixMultiplyCore for output S32");
311  ARM_COMPUTE_RETURN_ERROR_ON_MSG((a)->dimension(0) != (b)->dimension(1),
312  "The product AB is defined only if the number of columns in A is equal to the number of rows in B");
313  ARM_COMPUTE_RETURN_ERROR_ON_MSG(gemm_info.is_a_reshaped(), "Matrix A already reshaped is not supported");
314  ARM_COMPUTE_RETURN_ERROR_ON_MSG(gemm_info.is_b_reshaped(), "Matrix B already reshaped is not supported");
315 
316  GEMMInfo info = gemm_info;
317  const ITensorInfo *matrix_a_info = a;
318  const ITensorInfo *matrix_b_info = b;
319 
320  const ITensorInfo *a_to_use = a;
321 
322  TensorInfo tmp_a_info{};
323  TensorInfo tmp_b_info{};
324  TensorInfo mm_result_s32_info{};
325 
326  int32_t a_offset = a->quantization_info().uniform().offset;
327  int32_t b_offset = b->quantization_info().uniform().offset;
328 
329  bool fuse_output_stage = info.gemmlowp_output_stage().type != GEMMLowpOutputStageType::NONE;
330  if(fuse_output_stage)
331  {
332  auto_init_if_empty(mm_result_s32_info, a->clone()->set_tensor_shape(output->tensor_shape()).set_data_type(DataType::S32));
333  }
334 
335  // Convert QASYMM8->QASYMM8_SIGNED
336  TensorInfo signed_a{};
337  TensorInfo signed_output{};
338  bool flip_signedness = is_data_type_quantized_per_channel(b->data_type()) && (a->data_type() == DataType::QASYMM8) && info.reshape_b_only_on_first_run();
339  if(flip_signedness)
340  {
341  const int32_t offset_correction = 128;
342  const DataType dt = DataType::QASYMM8_SIGNED;
343  const UniformQuantizationInfo iqinfo = a_to_use->quantization_info().uniform();
344 
345  signed_a = a_to_use->clone()->set_data_type(dt).set_quantization_info(QuantizationInfo(iqinfo.scale, iqinfo.offset + offset_correction));
346  ARM_COMPUTE_RETURN_ON_ERROR(NEConvertQuantizedSignednessKernel::validate(a_to_use, &signed_a));
347  a_to_use = &signed_a;
348  a_offset = signed_a.quantization_info().uniform().offset;
349 
350  const UniformQuantizationInfo oqinfo = output->quantization_info().uniform();
351  signed_output = output->clone()->set_data_type(dt).set_quantization_info(QuantizationInfo(oqinfo.scale, oqinfo.offset - offset_correction));
352 
353  // Output stage correction
354  GEMMLowpOutputStageInfo output_stage_corr = info.gemmlowp_output_stage();
355  output_stage_corr.gemmlowp_offset = signed_output.quantization_info().uniform().offset;
356  output_stage_corr.gemmlowp_min_bound -= offset_correction;
357  output_stage_corr.gemmlowp_max_bound -= offset_correction;
358  info.set_gemmlowp_output_stage(output_stage_corr);
359 
360  // Update matrix a
361  matrix_a_info = &signed_a;
362  }
363 
364  // Initialize assembly kernel meta-data
365  const AsmGemmInfo asm_info = init_assembly_metadata(info);
366 
367  // Check if we need to run the optimized assembly kernel
368  bool run_optimised = false;
369  bool run_optimised_requantized = false;
370  if(is_data_type_quantized_asymmetric(a_to_use->data_type()) && info.gemmlowp_output_stage().type == GEMMLowpOutputStageType::QUANTIZE_DOWN_FIXEDPOINT)
371  {
372  run_optimised = bool(NEGEMMAssemblyDispatch::validate(a_to_use, b, c, output, asm_info));
373  run_optimised_requantized = run_optimised;
374  }
375  else
376  {
377  run_optimised = bool(NEGEMMAssemblyDispatch::validate(a_to_use, b, nullptr, fuse_output_stage ? &mm_result_s32_info : output, asm_info));
378  }
379 
380  if(run_optimised)
381  {
382  ARM_COMPUTE_RETURN_ERROR_ON(b->dimension(0) != output->dimension(0));
383  if(info.depth_output_gemm3d() != 0)
384  {
385  if(info.reinterpret_input_as_3d())
386  {
387  ARM_COMPUTE_RETURN_ERROR_ON(a->dimension(1) != output->dimension(1));
388  ARM_COMPUTE_RETURN_ERROR_ON(a->dimension(2) != output->dimension(2));
389  }
390  else
391  {
392  ARM_COMPUTE_RETURN_ERROR_ON(a->dimension(1) != output->dimension(1) * output->dimension(2));
393  }
394  }
395  else
396  {
397  ARM_COMPUTE_RETURN_ERROR_ON(a->dimension(1) != output->dimension(1));
398  }
399  }
400  else
401  {
402  ARM_COMPUTE_RETURN_ERROR_ON_MSG(info.reinterpret_input_as_3d(), "NEGEMM cannot reinterpret the input tensor as 3D");
403  ARM_COMPUTE_RETURN_ERROR_ON_MSG(info.depth_output_gemm3d() != 0, "NEGEMM cannot reinterpret the output tensor as 3D");
404 
405  const bool run_vector_matrix_multiplication = a->dimension(1) < 2;
406  if(!run_vector_matrix_multiplication)
407  {
408  matrix_a_info = &tmp_a_info;
409  matrix_b_info = &tmp_b_info;
410 
411  // The interleaved output matrix will have the following shape: [ a_height * 4, ceil(a_width / 4.0f) ]
412  TensorShape shape_tmp_a = a->tensor_shape();
413  shape_tmp_a.set(0, a->dimension(0) * 4);
414  shape_tmp_a.set(1, std::ceil(a->dimension(1) / 4.f));
415 
416  // The transpose1xW output matrix will have the following shape: [ b_height * 16, ceil(b_width / 16.0f) ]
417  TensorShape shape_tmp_b = b->tensor_shape();
418  shape_tmp_b.set(0, b->dimension(1) * 16);
419  shape_tmp_b.set(1, std::ceil(b->dimension(0) / 16.f));
420 
421  // Validate interleave kernel
422  auto_init_if_empty(tmp_a_info, a_to_use->clone()->set_tensor_shape(shape_tmp_a));
423  auto_init_if_empty(tmp_b_info, b->clone()->set_tensor_shape(shape_tmp_b));
424 
425  ARM_COMPUTE_RETURN_ON_ERROR(NEGEMMInterleave4x4Kernel::validate(a_to_use, &tmp_a_info));
426  ARM_COMPUTE_RETURN_ON_ERROR(NEGEMMTranspose1xWKernel::validate(b, &tmp_b_info));
427  }
428  }
429 
430  if(!run_optimised_requantized)
431  {
432  TensorInfo info_vector_sum_col{};
433  TensorInfo info_vector_sum_row{};
434 
435  const GEMMLowpReductionKernelInfo reduction_info(a_to_use->dimension(0), false, 0, false);
436 
437  // Validate matrix B reduction kernel only if _a_offset is not equal to 0
438  if(a_offset != 0)
439  {
440  info_vector_sum_col = TensorInfo(compute_reductionA_shape(*b), 1, DataType::S32);
441 
442  // Configure Matrix B reduction kernel
443  ARM_COMPUTE_RETURN_ON_ERROR(NEGEMMLowpMatrixBReductionKernel::validate(b, &info_vector_sum_col, reduction_info));
444  }
445 
446  // Validate Matrix A reduction kernel only if _b_offset is not equal to 0
447  if(b_offset != 0)
448  {
449  info_vector_sum_row = TensorInfo(compute_reductionB_shape(*a), 1, DataType::S32);
450 
451  // Configure matrix A reduction kernel
452  ARM_COMPUTE_RETURN_ON_ERROR(NEGEMMLowpMatrixAReductionKernel::validate(a_to_use, &info_vector_sum_row, reduction_info));
453  }
454 
455  if(fuse_output_stage)
456  {
457  if(!run_optimised)
458  {
459  ARM_COMPUTE_RETURN_ERROR_ON_MSG(info.reinterpret_input_as_3d(), "NEGEMMLowpMatrixMultiplyKernel cannot reinterpret the input tensor as 3D");
460  ARM_COMPUTE_RETURN_ERROR_ON_MSG(info.depth_output_gemm3d() != 0, "NEGEMMLowpMatrixMultiplyKernel cannot reinterpret the output tensor as 3D");
461 
462  ARM_COMPUTE_RETURN_ON_ERROR(NEGEMMLowpMatrixMultiplyKernel::validate(matrix_a_info, matrix_b_info, &mm_result_s32_info));
463  }
464 
465  // Validate offset contribution kernel
466  ARM_COMPUTE_RETURN_ON_ERROR(NEGEMMLowpOffsetContributionOutputStageKernel::validate(&mm_result_s32_info,
467  a_offset == 0 ? nullptr : &info_vector_sum_col,
468  b_offset == 0 ? nullptr : &info_vector_sum_row,
469  c,
470  flip_signedness ? &signed_output : output,
471  a_offset, b_offset,
472  info.gemmlowp_output_stage()));
473  }
474  else
475  {
476  if(!run_optimised)
477  {
478  ARM_COMPUTE_RETURN_ERROR_ON_MSG(info.reinterpret_input_as_3d(), "NEGEMMLowpMatrixMultiplyKernel cannot reinterpret the input tensor as 3D");
479  ARM_COMPUTE_RETURN_ERROR_ON_MSG(info.depth_output_gemm3d() != 0, "NEGEMMLowpMatrixMultiplyKernel cannot reinterpret the output tensor as 3D");
480 
481  ARM_COMPUTE_RETURN_ON_ERROR(NEGEMMLowpMatrixMultiplyKernel::validate(matrix_a_info, matrix_b_info, output));
482  }
483  // Validate offset contribution kernel
484  ARM_COMPUTE_RETURN_ON_ERROR(NEGEMMLowpOffsetContributionKernel::validate(output,
485  a_offset == 0 ? nullptr : &info_vector_sum_col,
486  b_offset == 0 ? nullptr : &info_vector_sum_row,
487  a_offset, b_offset));
488  }
489  }
490 
491  // Validate activation
492  const ActivationLayerInfo &activation = gemm_info.activation_info();
493  if(activation.enabled())
494  {
495  ARM_COMPUTE_RETURN_ON_ERROR(NEActivationLayer::validate(output, nullptr, activation));
496  }
497 
498  return Status{};
499 }

The documentation for this class was generated from the following files:

  • NEGEMMLowpMatrixMultiplyCore.h
  • NEGEMMLowpMatrixMultiplyCore.cpp