Compute Library 21.02
NEGEMM Class Reference

Basic function to execute GEMM on Neon.

#include <NEGEMM.h>


Public Member Functions

 	NEGEMM (std::shared_ptr< IMemoryManager > memory_manager=nullptr, IWeightsManager *weights_manager=nullptr)
 	Constructor.
 
 	NEGEMM (const NEGEMM &)=delete
 	Prevent instances of this class from being copied (as this class contains pointers).
 
 	NEGEMM (NEGEMM &&)=default
 	Default move constructor.
 
NEGEMM & 	operator= (const NEGEMM &)=delete
 	Prevent instances of this class from being copied (as this class contains pointers).
 
NEGEMM & 	operator= (NEGEMM &&)=default
 	Default move assignment operator.
 
 	~NEGEMM ()
 	Default destructor.
 
void 	configure (const ITensor *a, const ITensor *b, const ITensor *c, ITensor *d, float alpha, float beta, const GEMMInfo &gemm_info=GEMMInfo())
 	Initialise the kernel's inputs and output.
 
void 	run () override
 	Run the kernels contained in the function.
 
void 	prepare () override
 	Prepare the function for executing.
 
- Public Member Functions inherited from IFunction
virtual 	~IFunction ()=default
 	Destructor.
 

Static Public Member Functions

static Status validate (const ITensorInfo *a, const ITensorInfo *b, const ITensorInfo *c, const ITensorInfo *output, float alpha, float beta, const GEMMInfo &gemm_info=GEMMInfo())
 	Static function to check if given info will lead to a valid configuration of NEGEMM.
 

Detailed Description

Basic function to execute GEMM on Neon.

This function calls the following Neon kernels:

If optimized assembly is available:

  1. NEGEMMAssemblyDispatch
  2. NEActivationLayer (if alpha != 1.0)

Else:

  3. NEGEMMInterleave4x4Kernel (if the output tensor is a matrix)
  4. NEGEMMTranspose1xWKernel (if the output tensor is a matrix)
  5. NEGEMMMatrixMultiplyKernel

In both cases:

  6. NEGEMMMatrixAdditionKernel (if c != nullptr and beta != 0.0 and is not reshaped once)

Else:

  7. NEArithmeticAddition (if c != nullptr and is reshaped once and not optimized assembly in place)

  8. NEActivationLayer (if activation is specified in GEMMInfo)

Definition at line 62 of file NEGEMM.h.
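For orientation, here is a minimal usage sketch (not part of the generated reference; the shapes, the F32 data type and the alpha/beta values are illustrative assumptions):

    #include "arm_compute/core/Types.h"
    #include "arm_compute/runtime/NEON/functions/NEGEMM.h"
    #include "arm_compute/runtime/Tensor.h"

    using namespace arm_compute;

    int main()
    {
        // Computes D = alpha * A * B + beta * C with M = 3, N = 5, K = 4.
        // TensorShape is given as (width, height), i.e. (columns, rows).
        Tensor a, b, c, d;
        a.allocator()->init(TensorInfo(TensorShape(4U, 3U), 1, DataType::F32)); // A: 3 x 4
        b.allocator()->init(TensorInfo(TensorShape(5U, 4U), 1, DataType::F32)); // B: 4 x 5
        c.allocator()->init(TensorInfo(TensorShape(5U, 3U), 1, DataType::F32)); // C: 3 x 5
        d.allocator()->init(TensorInfo(TensorShape(5U, 3U), 1, DataType::F32)); // D: 3 x 5

        NEGEMM gemm;
        gemm.configure(&a, &b, &c, &d, 1.0f, 1.0f); // alpha = beta = 1

        // Allocate backing memory only after all functions are configured.
        a.allocator()->allocate();
        b.allocator()->allocate();
        c.allocator()->allocate();
        d.allocator()->allocate();

        // ... fill a, b and c with data ...

        gemm.run(); // dispatches the kernels listed above
        return 0;
    }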

Constructor & Destructor Documentation

◆ NEGEMM() [1/3]

NEGEMM ( std::shared_ptr< IMemoryManager >  memory_manager = nullptr,
         IWeightsManager *                  weights_manager = nullptr 
)

Constructor.

Definition at line 63 of file NEGEMM.cpp.

64  : _memory_group(memory_manager), _weights_manager(weights_manager), _interleave_kernel(), _transpose_kernel(), _mm_kernel(), _asm_glue(std::make_unique<NEGEMMAssemblyDispatch>()), _ma_kernel(),
65  _alpha_scale_func(nullptr), _add_bias(), _activation_func(), _tmp_a(), _tmp_b(), _tmp_d(), _original_b(nullptr), _run_vector_matrix_multiplication(false), _run_alpha_scale(false),
66  _run_addition(false), _run_bias_addition(false), _run_activation(false), _reshape_b_only_on_first_run(false), _is_prepared(false)
67 {
68 }

◆ NEGEMM() [2/3]

NEGEMM ( const NEGEMM &  )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ NEGEMM() [3/3]

NEGEMM ( NEGEMM &&  )
default

Default move constructor.

◆ ~NEGEMM()

~NEGEMM ( )
default

Default destructor.

Member Function Documentation

◆ configure()

void configure ( const ITensor *  a,
                 const ITensor *  b,
                 const ITensor *  c,
                 ITensor *        d,
                 float            alpha,
                 float            beta,
                 const GEMMInfo & gemm_info = GEMMInfo() 
)

Initialise the kernel's inputs and output.

Note
GEMM: General Matrix Multiply - [alpha * A * B + beta * C].
GEMM: The tensors a, b, c, d must have the same data type. You should not mix data types when calling this function.
Parameters
    [in]  a          First input tensor (Matrix A or Vector A). Data type supported: BFLOAT16/F16/F32
    [in]  b          Second input tensor (Matrix B). Data type supported: same as a
    [in]  c          Third input tensor (Matrix C). It can be a nullptr if just the multiplication between a and b is needed. Data type supported: same as a
    [out] d          Output tensor. Data type supported: same as a
    [in]  alpha      Weight of the matrix product
    [in]  beta       Weight of matrix C
    [in]  gemm_info  (Optional) Specifies if the matrix A and/or matrix B have been reshaped and if the reshape of matrix B should happen only for the first run
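As an illustration of the optional parameters (a sketch reusing the tensors a, b and d from the example in the detailed description; the GEMMInfo flags shown are an assumption about typical use, not prescribed values):

    // Plain product D = A * B: no accumulator term, so pass nullptr for c and beta = 0.
    NEGEMM gemm_ab;
    gemm_ab.configure(&a, &b, nullptr, &d, 1.0f, 0.0f);

    // If B holds constant weights, ask for its reshape to happen only on the first run;
    // the one-off work is then done in prepare() rather than on every run().
    const GEMMInfo info(false /* is_a_reshaped */, false /* is_b_reshaped */, true /* reshape_b_only_on_first_run */);
    NEGEMM gemm_const_b;
    gemm_const_b.configure(&a, &b, nullptr, &d, 1.0f, 0.0f, info);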

Definition at line 72 of file NEGEMM.cpp.

References GEMMInfo::activation_info(), TensorAllocator::allocate(), Tensor::allocator(), ARM_COMPUTE_ERROR_ON, ARM_COMPUTE_ERROR_THROW_ON, arm_compute::test::validation::b, ICloneable< T >::clone(), NEActivationLayer::configure(), NEArithmeticAddition::configure(), arm_compute::data_size_from_type(), ITensorInfo::data_type(), ITensorInfo::dimension(), ActivationLayerInfo::enabled(), ITensor::info(), TensorAllocator::init(), NEGEMMAssemblyDispatch::is_activation_supported(), ActivationLayerInfo::LINEAR, MemoryGroup::manage(), GEMMInfo::reshape_b_only_on_first_run(), arm_compute::SATURATE, TensorShape::set(), ITensorInfo::tensor_shape(), NEGEMMAssemblyDispatch::validate(), and NEGEMM::validate().

Referenced by NERNNLayer::configure(), NEWinogradConvolutionLayer::configure(), and NELSTMLayer::configure().

73 {
74  ARM_COMPUTE_ERROR_THROW_ON(NEGEMM::validate(a->info(), b->info(), (c != nullptr) ? c->info() : nullptr, d->info(), alpha, beta, gemm_info));
75 
76  const AsmGemmInfo asm_info = init_assembly_metadata(gemm_info);
77  const bool is_c_bias = gemm_info.reshape_b_only_on_first_run();
78  bool run_optimised = bool(NEGEMMAssemblyDispatch::validate(a->info(), b->info(), (is_c_bias && c != nullptr) ? c->info() : nullptr, d->info(), asm_info));
79 
80  // Check if we need to reshape the matrix B only on the first run
81  _is_prepared = false;
82  _reshape_b_only_on_first_run = gemm_info.reshape_b_only_on_first_run();
83  _run_vector_matrix_multiplication = a->info()->dimension(1) < 2;
84  _original_b = b;
85  _run_alpha_scale = alpha != 1.f;
86  _run_bias_addition = c != nullptr && gemm_info.reshape_b_only_on_first_run();
87  _run_addition = beta != 0 && c != nullptr && !gemm_info.reshape_b_only_on_first_run();
88  _run_activation = gemm_info.activation_info().enabled() && (!run_optimised || (run_optimised && !NEGEMMAssemblyDispatch::is_activation_supported(gemm_info.activation_info())));
89 
90  if(run_optimised)
91  {
92  const ITensor *c_to_use = is_c_bias ? c : nullptr;
93  _asm_glue->configure(a, b, c_to_use, d, asm_info);
94  ARM_COMPUTE_ERROR_ON(!_asm_glue->is_configured());
95 
96  // Scale product by alpha
97  if(_run_alpha_scale)
98  {
99  _alpha_scale_func.configure(d, nullptr, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::LINEAR, alpha, 0.f));
100  }
101  }
102  else
103  {
104  // Pick output tensor in case bias addition should be performed
105  ITensor *gemm_output_to_use = d;
106  if(_run_bias_addition)
107  {
108  gemm_output_to_use = &_tmp_d;
109  _memory_group.manage(&_tmp_d);
110  }
111 
112  _mm_kernel = std::make_unique<NEGEMMMatrixMultiplyKernel>();
113 
114  // Select between GEMV and GEMM
115  if(_run_vector_matrix_multiplication)
116  {
117  // Configure the matrix multiply kernel
118  _mm_kernel->configure(a, b, gemm_output_to_use, alpha, false);
119  }
120  else
121  {
122  TensorShape shape_tmp_a = a->info()->tensor_shape();
123  TensorShape shape_tmp_b = b->info()->tensor_shape();
124 
125  shape_tmp_a.set(0, a->info()->dimension(0) * 4);
126  shape_tmp_a.set(1, std::ceil(a->info()->dimension(1) / 4.0f));
127 
128  const unsigned int transpose_w = 16 / data_size_from_type(b->info()->data_type());
129  shape_tmp_b.set(0, b->info()->dimension(1) * transpose_w);
130  shape_tmp_b.set(1, std::ceil(b->info()->dimension(0) / static_cast<float>(transpose_w)));
131 
132  TensorInfo info_a = a->info()->clone()->set_tensor_shape(shape_tmp_a).set_is_resizable(true);
133  TensorInfo info_b = b->info()->clone()->set_tensor_shape(shape_tmp_b).set_is_resizable(true);
134 
135  _tmp_a.allocator()->init(info_a);
136  _tmp_b.allocator()->init(info_b);
137 
138  // Manage intermediate buffers
139  _memory_group.manage(&_tmp_a);
140  if(!_reshape_b_only_on_first_run)
141  {
142  _memory_group.manage(&_tmp_b);
143  }
144 
145  int m = a->info()->dimension(1);
146  int n = b->info()->dimension(0);
147  int k = a->info()->dimension(0);
148 
149  // Configure interleave kernel
150  _interleave_kernel = std::make_unique<NEGEMMInterleave4x4Kernel>();
151  _interleave_kernel->configure(a, &_tmp_a);
152 
153  // Configure transpose kernel
154  _transpose_kernel = std::make_unique<NEGEMMTranspose1xWKernel>();
155  _transpose_kernel->configure(b, &_tmp_b);
156 
157  // Configure matrix multiplication kernel
158  _mm_kernel->configure(&_tmp_a, &_tmp_b, gemm_output_to_use, alpha, true, GEMMReshapeInfo(m, n, k));
159 
160  // Allocate once all the configure methods have been called
161  _tmp_a.allocator()->allocate();
162  if(!_reshape_b_only_on_first_run)
163  {
164  _tmp_b.allocator()->allocate();
165  }
166  }
167 
168  if(_run_bias_addition)
169  {
170  _add_bias.configure(gemm_output_to_use, c, d, ConvertPolicy::SATURATE);
171  _tmp_d.allocator()->allocate();
172  }
173  }
174 
175  // Configure matrix addition kernel
176  if(_run_addition)
177  {
178  _ma_kernel = std::make_unique<NEGEMMMatrixAdditionKernel>();
179  _ma_kernel->configure(c, d, beta);
180  }
181 
182  // Configure activation
183  const ActivationLayerInfo &activation = gemm_info.activation_info();
184  if(_run_activation)
185  {
186  _activation_func.configure(d, nullptr, activation);
187  }
188 }

◆ operator=() [1/2]

NEGEMM & operator= ( const NEGEMM &  )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

NEGEMM& operator= ( NEGEMM &&  )
default

Default move assignment operator.

◆ prepare()

void prepare ( )
overridevirtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note
Prepare stage might not need all the function's buffers' backing memory to be available in order to execute
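A sketch of calling prepare() explicitly so that the one-off work (e.g. reshaping matrix B) is paid before a latency-sensitive loop; num_batches and the tensor refresh are hypothetical:

    gemm.prepare(); // one-off work happens here

    for(int i = 0; i < num_batches; ++i)
    {
        // ... refresh the contents of the input tensors ...
        gemm.run(); // run() itself calls prepare() on first use if it has not been called
    }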

Reimplemented from IFunction.

Definition at line 359 of file NEGEMM.cpp.

References TensorAllocator::allocate(), Tensor::allocator(), IWeightsManager::are_weights_managed(), ARM_COMPUTE_ERROR_ON, Window::DimY, Scheduler::get(), ITensor::is_used(), ITensor::mark_as_unused(), and IScheduler::schedule().

Referenced by NERNNLayer::prepare(), NEWinogradConvolutionLayer::prepare(), NEFullyConnectedLayer::prepare(), NEGEMMConvolutionLayer::prepare(), and NEGEMM::run().

360 {
361  if(!_is_prepared)
362  {
363  const bool original_b_managed_by_weights_manager = _weights_manager && _weights_manager->are_weights_managed(_original_b);
364  if(_asm_glue->is_configured())
365  {
366  if(!original_b_managed_by_weights_manager)
367  {
368  ARM_COMPUTE_ERROR_ON(!_original_b->is_used());
369  }
370 
371  _asm_glue->prepare();
372  if(!original_b_managed_by_weights_manager)
373  {
374  _original_b->mark_as_unused();
375  }
376  }
377  else if(_reshape_b_only_on_first_run && !_run_vector_matrix_multiplication && !_asm_glue->is_configured())
378  {
379  if(!original_b_managed_by_weights_manager)
380  {
381  ARM_COMPUTE_ERROR_ON(!_original_b->is_used());
382  }
383 
384  _tmp_b.allocator()->allocate();
385  NEScheduler::get().schedule(_transpose_kernel.get(), Window::DimY);
386  if(!original_b_managed_by_weights_manager)
387  {
388  _original_b->mark_as_unused();
389  }
390  }
391 
392  _is_prepared = true;
393  }
394 }

◆ run()

void run ( )
overridevirtual

Run the kernels contained in the function.

For Neon kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Will call prepare() on the first run if it hasn't been done already.
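For example, the thread count can be capped through the scheduler singleton before calling run() (a sketch; the choice of two threads is arbitrary):

    #include "arm_compute/runtime/Scheduler.h"

    // Limit the threads used by parallelisable Neon kernels, then execute.
    Scheduler::get().set_num_threads(2);
    gemm.run();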

Implements IFunction.

Definition at line 309 of file NEGEMM.cpp.

References Window::DimX, Window::DimY, Scheduler::get(), NEGEMM::prepare(), NEActivationLayer::run(), NEArithmeticAddition::run(), and IScheduler::schedule().

Referenced by NERNNLayer::run(), NEWinogradConvolutionLayer::run(), NELSTMLayer::run(), NEFullyConnectedLayer::run(), and NEGEMMConvolutionLayer::run().

310 {
311  prepare();
312 
313  MemoryGroupResourceScope scope_mg(_memory_group);
314 
315  if(_asm_glue->is_configured())
316  {
317  _asm_glue->run();
318  if(_run_alpha_scale)
319  {
320  _alpha_scale_func.run();
321  }
322  }
323  else
324  {
325  if(!_run_vector_matrix_multiplication)
326  {
327  // Run interleave kernel
328  NEScheduler::get().schedule(_interleave_kernel.get(), Window::DimY);
329 
330  if(!_reshape_b_only_on_first_run)
331  {
332  // Run transpose kernel
333  NEScheduler::get().schedule(_transpose_kernel.get(), Window::DimY);
334  }
335  }
336 
337  NEScheduler::get().schedule(_mm_kernel.get(), _run_vector_matrix_multiplication ? Window::DimX : Window::DimY);
338 
339  // Run bias addition kernel
340  if(_run_bias_addition)
341  {
342  _add_bias.run();
343  }
344  }
345 
346  // Run matrix addition kernel
347  if(_run_addition)
348  {
349  NEScheduler::get().schedule(_ma_kernel.get(), Window::DimY);
350  }
351 
352  // Run activation function
353  if(_run_activation)
354  {
355  _activation_func.run();
356  }
357 }

◆ validate()

static Status validate ( const ITensorInfo *  a,
                         const ITensorInfo *  b,
                         const ITensorInfo *  c,
                         const ITensorInfo *  output,
                         float                alpha,
                         float                beta,
                         const GEMMInfo &     gemm_info = GEMMInfo() 
)

Static function to check if given info will lead to a valid configuration of NEGEMM.

Parameters
    [in]  a          First input tensor info (Matrix or Vector A). Data types supported: BFLOAT16/F16/F32
    [in]  b          Second input tensor info (Matrix B). Data type supported: same as a.
    [in]  c          Third input tensor info (Matrix C). It can be a nullptr if just the multiplication between a and b is needed. Data type supported: same as a.
    [out] output     Output tensor info. Data type supported: same as a
    [in]  alpha      Weight of the matrix product
    [in]  beta       Weight of matrix C
    [in]  gemm_info  (Optional) Specifies if the matrix A and/or matrix B have been reshaped and if the reshape of matrix B should happen only for the first run

Returns
    a status
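Typical use is to gate configuration on the returned Status (a sketch; the error-handling style is the caller's choice):

    #include <iostream>

    const Status status = NEGEMM::validate(a.info(), b.info(), nullptr, d.info(), 1.0f, 0.0f);
    if(bool(status))
    {
        NEGEMM gemm;
        gemm.configure(&a, &b, nullptr, &d, 1.0f, 0.0f);
    }
    else
    {
        std::cerr << "NEGEMM rejected the configuration: " << status.error_description() << std::endl;
    }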

Definition at line 190 of file NEGEMM.cpp.

References GEMMInfo::activation_info(), ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_CPU_BF16_UNSUPPORTED, ARM_COMPUTE_RETURN_ERROR_ON_CPU_F16_UNSUPPORTED, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_MSG, ARM_COMPUTE_RETURN_ON_ERROR, ARM_COMPUTE_UNUSED, arm_compute::auto_init_if_empty(), arm_compute::test::validation::b, arm_compute::BFLOAT16, ICloneable< T >::clone(), TensorInfo::clone(), arm_compute::misc::shape_calculator::compute_interleaved_shape(), arm_compute::misc::shape_calculator::compute_mm_shape(), arm_compute::misc::shape_calculator::compute_transpose1xW_with_element_size_shape(), ITensorInfo::data_type(), GEMMInfo::depth_output_gemm3d(), ITensorInfo::dimension(), ActivationLayerInfo::enabled(), arm_compute::F16, arm_compute::F32, GEMMInfo::is_a_reshaped(), GEMMInfo::is_b_reshaped(), GEMMInfo::reinterpret_input_as_3d(), GEMMInfo::reshape_b_only_on_first_run(), arm_compute::SATURATE, ITensorInfo::total_size(), NEGEMMMatrixAdditionKernel::validate(), NEGEMMMatrixMultiplyKernel::validate(), NEActivationLayer::validate(), NEGEMMInterleave4x4Kernel::validate(), NEArithmeticAddition::validate(), NEGEMMTranspose1xWKernel::validate(), and NEGEMMAssemblyDispatch::validate().

Referenced by NEGEMM::configure(), and NELSTMLayer::validate().

191 {
192  ARM_COMPUTE_UNUSED(alpha);
193  const bool is_c_bias = gemm_info.reshape_b_only_on_first_run();
194 
195  ARM_COMPUTE_RETURN_ERROR_ON_CPU_BF16_UNSUPPORTED(a);
196  ARM_COMPUTE_RETURN_ERROR_ON_CPU_F16_UNSUPPORTED(a);
197  ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(a, 1, DataType::BFLOAT16, DataType::F16, DataType::F32);
198  ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(a, b);
199  ARM_COMPUTE_RETURN_ERROR_ON_MSG(a->dimension(0) != b->dimension(1), "The product AB is defined only if the number of columns in A is equal to the number of rows in B");
200  ARM_COMPUTE_RETURN_ERROR_ON_MSG(gemm_info.is_a_reshaped(), "Matrix A already reshaped is not supported");
201  ARM_COMPUTE_RETURN_ERROR_ON_MSG(gemm_info.is_b_reshaped(), "Matrix B already reshaped is not supported");
202  if(a->data_type() != DataType::BFLOAT16)
203  {
204  ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(a, output);
205  }
206 
207  if(c != nullptr && !is_c_bias)
208  {
209  ARM_COMPUTE_RETURN_ERROR_ON(gemm_info.depth_output_gemm3d() != 0);
210  ARM_COMPUTE_RETURN_ERROR_ON(gemm_info.reinterpret_input_as_3d());
211  ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(c, output);
212  ARM_COMPUTE_RETURN_ERROR_ON_MSG(a->dimension(1) != c->dimension(1), "The C matrix must have the same number of rows as the matrix A");
213  ARM_COMPUTE_RETURN_ERROR_ON_MSG(b->dimension(0) != c->dimension(0), "The C matrix must have the same number of columns as the matrix B");
214  }
215 
216  if(output->total_size() != 0)
217  {
218  ARM_COMPUTE_RETURN_ERROR_ON(b->dimension(0) != output->dimension(0));
219  if(gemm_info.depth_output_gemm3d() != 0)
220  {
221  if(gemm_info.reinterpret_input_as_3d())
222  {
223  ARM_COMPUTE_RETURN_ERROR_ON(a->dimension(1) != output->dimension(1));
224  ARM_COMPUTE_RETURN_ERROR_ON(a->dimension(2) != output->dimension(2));
225  }
226  else
227  {
228  ARM_COMPUTE_RETURN_ERROR_ON(a->dimension(1) != output->dimension(1) * output->dimension(2));
229  }
230  }
231  else
232  {
233  ARM_COMPUTE_RETURN_ERROR_ON(a->dimension(1) != output->dimension(1));
234  }
235  }
236 
237  // Check if we need to run the optimized assembly kernel
238  AsmGemmInfo asm_info = init_assembly_metadata(gemm_info);
239  const bool run_optimised = bool(NEGEMMAssemblyDispatch::validate(a, b, is_c_bias ? c : nullptr, output, asm_info));
240 
241  if(!run_optimised)
242  {
243  ARM_COMPUTE_RETURN_ERROR_ON_MSG(gemm_info.reinterpret_input_as_3d(), "NEGEMM cannot reinterpret the input tensor as 3D");
244  ARM_COMPUTE_RETURN_ERROR_ON_MSG(gemm_info.depth_output_gemm3d() != 0, "NEGEMM cannot reinterpret the output tensor as 3D");
245 
246  // Check if the first input tensor is a vector.
247  const bool run_vector_matrix_multiplication = a->dimension(1) < 2;
248  // Check if we need to reshape the matrix A and matrix B
249  const bool run_interleave_transpose = !run_vector_matrix_multiplication && !(gemm_info.reshape_b_only_on_first_run());
250 
251  // Arguments used by GEMMReshapeInfo
252  // If we pass the matrix A and matrix B reshaped to NEGEMMMatrixMultiplyKernel, we need to pass m, n, k, mult_transpose1xW_width and mult_interleave4x4_height to NEGEMMReshapeInfo
253  // in order to know how the matrices have been reshaped
254  const int m = a->dimension(1);
255  const int n = b->dimension(0);
256  const int k = a->dimension(0);
257  int mult_transpose1xW_width = 1;
258  int mult_interleave4x4_height = 1;
259 
260  const GEMMReshapeInfo reshape_info = GEMMReshapeInfo(m, n, k, mult_transpose1xW_width, mult_interleave4x4_height, gemm_info.depth_output_gemm3d());
261 
262  const ITensorInfo *matrix_a_info = a;
263  const ITensorInfo *matrix_b_info = b;
264 
265  TensorInfo tmp_a_info{};
266  TensorInfo tmp_b_info{};
267  TensorInfo tmp_output_info = *output->clone();
268 
269  if(run_interleave_transpose)
270  {
271  matrix_a_info = &tmp_a_info;
272  matrix_b_info = &tmp_b_info;
273 
274  // Validate interleave kernel
275  auto_init_if_empty(tmp_a_info, a->clone()->set_tensor_shape(compute_interleaved_shape(*a, mult_interleave4x4_height, gemm_info.reinterpret_input_as_3d())));
276  ARM_COMPUTE_RETURN_ON_ERROR(NEGEMMInterleave4x4Kernel::validate(a, &tmp_a_info));
277 
278  // Validate transpose kernel
279  auto_init_if_empty(tmp_b_info, b->clone()->set_tensor_shape(compute_transpose1xW_with_element_size_shape(*b, mult_transpose1xW_width)));
280  ARM_COMPUTE_RETURN_ON_ERROR(NEGEMMTranspose1xWKernel::validate(b, &tmp_b_info));
281  }
282 
283  // Validate matrix multiply
284  auto_init_if_empty(tmp_output_info, matrix_a_info->clone()->set_tensor_shape(compute_mm_shape(*matrix_a_info, *matrix_b_info, run_interleave_transpose, reshape_info)));
285  ARM_COMPUTE_RETURN_ON_ERROR(NEGEMMMatrixMultiplyKernel::validate(matrix_a_info, matrix_b_info, &tmp_output_info, alpha, run_interleave_transpose, reshape_info));
286 
287  if(c != nullptr && gemm_info.reshape_b_only_on_first_run())
288  {
289  ARM_COMPUTE_RETURN_ON_ERROR(NEArithmeticAddition::validate(&tmp_output_info, c, output, ConvertPolicy::SATURATE));
290  }
291  }
292 
293  // Validate matrix addition kernel
294  if(beta != 0 && c != nullptr && !is_c_bias)
295  {
296  ARM_COMPUTE_RETURN_ON_ERROR(NEGEMMMatrixAdditionKernel::validate(c, output, beta));
297  }
298 
299  // Validate activation
300  const ActivationLayerInfo &activation = gemm_info.activation_info();
301  if(activation.enabled())
302  {
303  ARM_COMPUTE_RETURN_ON_ERROR(NEActivationLayer::validate(output, nullptr, activation));
304  }
305 
306  return Status{};
307 }

The documentation for this class was generated from the following files:

  • NEGEMM.h
  • NEGEMM.cpp