Compute Library
 23.08
CLGEMM Class Reference

Basic function to execute GEMM on OpenCL.

#include <CLGEMM.h>


Public Member Functions

 CLGEMM (std::shared_ptr< IMemoryManager > memory_manager=nullptr, IWeightsManager *weights_manager=nullptr)
 Default constructor.

 ~CLGEMM ()
 Default destructor.

 CLGEMM (const CLGEMM &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).

 CLGEMM (CLGEMM &&)
 Default move constructor.

CLGEMM & operator= (const CLGEMM &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).

CLGEMM & operator= (CLGEMM &&)
 Default move assignment operator.

void configure (const CLCompileContext &compile_context, const ICLTensor *a, const ICLTensor *b, const ICLTensor *c, ICLTensor *output, float alpha, float beta, const GEMMInfo &gemm_info=GEMMInfo())
 Initialise the kernel's inputs and output.

void configure (const ICLTensor *a, const ICLTensor *b, const ICLTensor *c, ICLTensor *output, float alpha, float beta, const GEMMInfo &gemm_info=GEMMInfo())
 Initialise the kernel's inputs and output.

void run () override
 Run the kernels contained in the function.

void prepare () override
 Prepare the function for executing.

Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor.
 

Static Public Member Functions

static Status validate (const ITensorInfo *a, const ITensorInfo *b, const ITensorInfo *c, const ITensorInfo *output, float alpha, float beta, const GEMMInfo &gemm_info=GEMMInfo())
 Static function to check if given info will lead to a valid configuration of CLGEMM.
 

Detailed Description

Basic function to execute GEMM on OpenCL.

Definition at line 45 of file CLGEMM.h.

Constructor & Destructor Documentation

◆ CLGEMM() [1/3]

CLGEMM ( std::shared_ptr< IMemoryManager > memory_manager = nullptr,
IWeightsManager * weights_manager = nullptr
)

Default constructor.

Parameters
[in]  memory_manager   (Optional) Memory manager.
[in]  weights_manager  (Optional) Weights manager.

Definition at line 54 of file CLGEMM.cpp.

    CLGEMM::CLGEMM(std::shared_ptr<IMemoryManager> memory_manager, IWeightsManager *weights_manager)
        : _impl(std::make_unique<Impl>())
    {
        _impl->memory_group    = MemoryGroup(memory_manager);
        _impl->weights_manager = weights_manager;
    }

◆ ~CLGEMM()

~CLGEMM ( )
default

Default destructor.

◆ CLGEMM() [2/3]

CLGEMM ( const CLGEMM & )
delete

Prevent instances of this class from being copied (as this class contains pointers).

◆ CLGEMM() [3/3]

CLGEMM ( CLGEMM &&  )

Default move constructor.
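
The constructor listing above stores all state behind `_impl`, created with `std::make_unique<Impl>()`, and the class deletes its copy operations while keeping default moves. A minimal sketch of that pattern follows. The class name `MoveOnlyFunction`, its `Impl` contents, and the `has_state()` helper are hypothetical and only illustrate the idiom; they are not taken from CLGEMM.h.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a function object that hides its state behind a pointer.
class MoveOnlyFunction
{
public:
    MoveOnlyFunction();
    ~MoveOnlyFunction();

    // Copying is disabled because the object owns its state through a pointer.
    MoveOnlyFunction(const MoveOnlyFunction &) = delete;
    MoveOnlyFunction &operator=(const MoveOnlyFunction &) = delete;

    // Moves are declared here and defaulted below, once Impl is a complete type.
    MoveOnlyFunction(MoveOnlyFunction &&);
    MoveOnlyFunction &operator=(MoveOnlyFunction &&);

    // Illustrative helper: reports whether this object still owns its state.
    bool has_state() const;

private:
    struct Impl;
    std::unique_ptr<Impl> _impl;
};

// Hypothetical private state; the real Impl members are not shown on this page.
struct MoveOnlyFunction::Impl
{
    bool is_prepared{ false };
};

MoveOnlyFunction::MoveOnlyFunction() : _impl(std::make_unique<Impl>()) {}
MoveOnlyFunction::~MoveOnlyFunction() = default;
MoveOnlyFunction::MoveOnlyFunction(MoveOnlyFunction &&) = default;
MoveOnlyFunction &MoveOnlyFunction::operator=(MoveOnlyFunction &&) = default;

bool MoveOnlyFunction::has_state() const
{
    return _impl != nullptr;
}
```

Defining the destructor and move operations after `Impl` is complete is what lets `std::unique_ptr` work with a forward-declared type. A moved-from object no longer owns any state and should not be used further.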

Member Function Documentation

◆ configure() [1/2]

void configure ( const CLCompileContext & compile_context,
const ICLTensor * a,
const ICLTensor * b,
const ICLTensor * c,
ICLTensor * output,
float  alpha,
float  beta,
const GEMMInfo & gemm_info = GEMMInfo()
)

Initialise the kernel's inputs and output.

Valid data layouts:

  • All

Valid data type configurations:

src0  src1  src2  dst
F32   F32   F32   F32
F16   F16   F16   F16

Note
GEMM: General Matrix Multiply - [alpha * A * B + beta * C].
All tensors must have the same data type.
Whilst the first input tensor can be a vector, the second input tensor must be at least a matrix.
Batched GEMM only allows RHS tensor's rank to be <= 3.
Batched GEMM only supports broadcasting cases where RHS rank < LHS rank but not the other way around.

Parameters
[in]   compile_context  The compile context to be used.
[in]   a                First input tensor (Matrix or Vector A). Data types supported: F16/F32.
[in]   b                Second input tensor (Matrix B). Data type supported: same as a.
[in]   c                Third input tensor (Matrix C). It can be a nullptr if just the multiplication between a and b is needed. Data type supported: same as a.
[out]  output           Output tensor. Data type supported: same as a.
[in]   alpha            Weight of the matrix product.
[in]   beta             Weight of matrix C.
[in]   gemm_info        (Optional) Specifies if the matrix A and/or matrix B have been reshaped and if the reshape of matrix B should happen only for the first run. GEMMInfo also contains information about the reshaping in case matrix A and matrix B have been already transformed.
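
To make the arithmetic in the note concrete, here is a minimal CPU reference for `alpha * A * B + beta * C`. It is only a sketch for checking results by hand: the function name `reference_gemm`, the row-major `std::vector<float>` storage, and the explicit `m`, `n`, `k` arguments are assumptions of this example and are not part of Compute Library. Passing `nullptr` for `c` drops the `beta * C` term, in the same spirit as the optional `c` tensor described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Compute Library.
// Computes dst = alpha * A * B + beta * C on row-major data, where
// A is m x k, B is k x n, and C (optional) and dst are m x n.
std::vector<float> reference_gemm(const std::vector<float> &a,
                                  const std::vector<float> &b,
                                  const std::vector<float> *c,
                                  std::size_t m, std::size_t n, std::size_t k,
                                  float alpha, float beta)
{
    assert(a.size() == m * k);
    assert(b.size() == k * n);
    assert(c == nullptr || c->size() == m * n);

    std::vector<float> dst(m * n, 0.0f);
    for(std::size_t row = 0; row < m; ++row)
    {
        for(std::size_t col = 0; col < n; ++col)
        {
            // Dot product of one row of A with one column of B.
            float acc = 0.0f;
            for(std::size_t i = 0; i < k; ++i)
            {
                acc += a[row * k + i] * b[i * n + col];
            }

            float value = alpha * acc;
            if(c != nullptr)
            {
                value += beta * (*c)[row * n + col];
            }
            dst[row * n + col] = value;
        }
    }
    return dst;
}
```

A reference like this is mainly useful for comparing against the output tensor of a small GEMM while debugging; it makes no attempt to match the tensor layouts or reshaping used by the OpenCL kernels.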

Definition at line 68 of file CLGEMM.cpp.

    void CLGEMM::configure(const CLCompileContext &compile_context, const ICLTensor *a, const ICLTensor *b, const ICLTensor *c, ICLTensor *output, float alpha, float beta, const GEMMInfo &gemm_info)
    {
        ARM_COMPUTE_ERROR_ON_NULLPTR(a, b, output);

        _impl->b           = b;
        _impl->op          = std::make_unique<OperatorType>();
        _impl->is_prepared = gemm_info.retain_internal_weights();

        _impl->op->configure(compile_context, a->info(), b->info(), c != nullptr ? c->info() : nullptr, output->info(), alpha, beta, gemm_info);
        _impl->aux_mem_req = _impl->op->workspace();

        // Manage/allocate auxiliary tensors
        if(_impl->is_prepared)
        {
            _impl->run_pack.add_const_tensor(ACL_SRC_0, a);
            _impl->run_pack.add_tensor(ACL_DST, output);
        }
        else
        {
            _impl->run_pack  = { { ACL_SRC_0, a }, { ACL_SRC_2, c }, { ACL_DST, output } };
            _impl->prep_pack = { { ACL_SRC_1, _impl->b } };

            _impl->workspace_tensors = manage_workspace<CLTensor>(_impl->op->workspace(), _impl->memory_group, _impl->run_pack, _impl->prep_pack);
        }
    }

References arm_compute::ACL_DST, arm_compute::ACL_SRC_0, arm_compute::ACL_SRC_1, arm_compute::ACL_SRC_2, ARM_COMPUTE_ERROR_ON_NULLPTR, arm_compute::test::validation::b, arm_compute::test::validation::gemm_info, and ITensor::info().

Referenced by CLRNNLayer::configure(), CLGEMM::configure(), CLGEMMDeconvolutionLayer::configure(), and CLLSTMLayer::configure().

◆ configure() [2/2]

void configure ( const ICLTensor * a,
const ICLTensor * b,
const ICLTensor * c,
ICLTensor * output,
float  alpha,
float  beta,
const GEMMInfo & gemm_info = GEMMInfo()
)

Initialise the kernel's inputs and output.

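Similar to the CLGEMM::configure() overload that takes a compile context, except that the compile context is obtained from the CLKernelLibrary singleton.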

Definition at line 63 of file CLGEMM.cpp.

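    void CLGEMM::configure(const ICLTensor *a, const ICLTensor *b, const ICLTensor *c, ICLTensor *output, float alpha, float beta, const GEMMInfo &gemm_info)
    {
        configure(CLKernelLibrary::get().get_compile_context(), a, b, c, output, alpha, beta, gemm_info);
    }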

References arm_compute::test::validation::b, CLGEMM::configure(), arm_compute::test::validation::gemm_info, and CLKernelLibrary::get().

◆ operator=() [1/2]

CLGEMM & operator= ( CLGEMM && )

Default move assignment operator.

◆ operator=() [2/2]

CLGEMM & operator= ( const CLGEMM & )
delete

Prevent instances of this class from being copied (as this class contains pointers).

◆ prepare()

void prepare ( )
override virtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note
The prepare stage might not need the backing memory of all the function's buffers to be available in order to execute.

Reimplemented from IFunction.

Definition at line 108 of file CLGEMM.cpp.

    void CLGEMM::prepare()
    {
        if(!_impl->is_prepared)
        {
            _impl->op->prepare(_impl->prep_pack);

            auto has_reshape = std::find_if(_impl->aux_mem_req.begin(),
                                            _impl->aux_mem_req.end(),
                                            [](const MemoryInfo &m) -> bool { return m.lifetime == MemoryLifetime::Persistent; });

            if(has_reshape != std::end(_impl->aux_mem_req))
            {
                _impl->b->mark_as_unused();
            }
            else
            {
                // Pack the B matrix to be used as the underlying GEMM performs no reshapes
                _impl->run_pack.add_const_tensor(ACL_SRC_1, _impl->b);
            }
            _impl->is_prepared = true;
        }
    }

References arm_compute::ACL_SRC_1, arm_compute::mlgo::parser::end(), and arm_compute::test::validation::m.

Referenced by CLRNNLayer::prepare(), CLGEMMDeconvolutionLayer::prepare(), and CLGEMM::run().
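
The listing above searches the auxiliary memory requirements for an entry with a persistent lifetime and, if one is found, marks the original B tensor as unused; otherwise B is added to the run pack. The decision can be sketched in isolation as below. The `Lifetime` enum, the `AuxMemory` struct, and the function name are hypothetical simplifications written for this example; they are not the library's `MemoryInfo` or `MemoryLifetime` definitions.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified mirrors of the types used in prepare().
enum class Lifetime
{
    Temporary,
    Prepare,
    Persistent
};

struct AuxMemory
{
    Lifetime lifetime;
};

// Returns true when at least one auxiliary buffer is persistent. In the
// listing above this is treated as a sign that a reshaped copy of B is kept,
// so the original B tensor is no longer needed at run time.
bool original_b_can_be_released(const std::vector<AuxMemory> &aux_mem_req)
{
    return std::any_of(aux_mem_req.begin(), aux_mem_req.end(),
                       [](const AuxMemory &m) { return m.lifetime == Lifetime::Persistent; });
}
```

`std::any_of` expresses the same test as the `std::find_if` comparison against `end()` in the listing, without keeping the iterator around.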

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For CPU kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.

Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads.

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.

Note
The function returns without waiting for the kernels to finish executing. It is the user's responsibility to wait for them to complete.
prepare() is called on the first run if it has not already been called.

Implements IFunction.

Definition at line 99 of file CLGEMM.cpp.

    void CLGEMM::run()
    {
        prepare();

        MemoryGroupResourceScope scope_mg(_impl->memory_group);

        _impl->op->run(_impl->run_pack);
    }

References CLGEMM::prepare().

Referenced by CLRNNLayer::run(), CLGEMMDeconvolutionLayer::run(), and CLLSTMLayer::run().
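
As the listings show, run() calls prepare() on every invocation, and prepare() only does work while the `is_prepared` flag is false. A minimal, library-independent sketch of that "prepare once, run many times" pattern follows. The class `LazyFunction` and its counters are hypothetical and exist only to make the behaviour observable.

```cpp
#include <cassert>

// Hypothetical model of a function whose one-off setup is triggered lazily.
class LazyFunction
{
public:
    // Performs the one-off work the first time it is called; later calls do nothing.
    void prepare()
    {
        if(!_is_prepared)
        {
            ++_prepare_count; // stands in for one-off work such as reshaping weights
            _is_prepared = true;
        }
    }

    // Always calls prepare() first, so callers may skip the explicit prepare() step.
    void run()
    {
        prepare();
        ++_run_count; // stands in for enqueueing the kernels
    }

    int prepare_count() const
    {
        return _prepare_count;
    }

    int run_count() const
    {
        return _run_count;
    }

private:
    bool _is_prepared{ false };
    int  _prepare_count{ 0 };
    int  _run_count{ 0 };
};
```

Calling prepare() explicitly before the first run() moves the one-off cost out of the first inference, which can matter when the first run is timed.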

◆ validate()

Status validate ( const ITensorInfo * a,
const ITensorInfo * b,
const ITensorInfo * c,
const ITensorInfo * output,
float  alpha,
float  beta,
const GEMMInfo & gemm_info = GEMMInfo()
)
static

Static function to check if given info will lead to a valid configuration of CLGEMM.

Similar to CLGEMM::configure()

Returns
A status indicating whether the given configuration is valid.

Definition at line 94 of file CLGEMM.cpp.

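    Status CLGEMM::validate(const ITensorInfo *a, const ITensorInfo *b, const ITensorInfo *c, const ITensorInfo *output, float alpha, float beta, const GEMMInfo &gemm_info)
    {
        return OperatorType::validate(a, b, c, output, alpha, beta, gemm_info);
    }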

References arm_compute::test::validation::b, arm_compute::test::validation::gemm_info, and ClGemm::validate().

Referenced by CLRNNLayer::validate(), CLGEMMDeconvolutionLayer::validate(), and CLLSTMLayer::validate().
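
validate() works on tensor metadata only, so a configuration can be rejected before any tensor is allocated or any kernel is configured. The sketch below illustrates the idea with a plain 2D shape check. `Shape2D`, `CheckResult`, and `check_gemm_shapes` are hypothetical names, and the rules shown (matching inner dimensions, with C and the output both M x N) are a simplification assumed for this example; the checks the library actually performs are not listed on this page and may accept other cases.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical metadata-only description of a matrix.
struct Shape2D
{
    std::size_t rows;
    std::size_t cols;
};

// Hypothetical result type: ok is false when the configuration is rejected.
struct CheckResult
{
    bool        ok;
    std::string message;
};

// Simplified shape validation for dst = alpha * A * B + beta * C.
// c may be nullptr when only the product of A and B is needed.
CheckResult check_gemm_shapes(const Shape2D &a, const Shape2D &b, const Shape2D *c, const Shape2D &output)
{
    if(a.cols != b.rows)
    {
        return { false, "columns of A must match rows of B" };
    }
    if(output.rows != a.rows || output.cols != b.cols)
    {
        return { false, "output must have the rows of A and the columns of B" };
    }
    if(c != nullptr && (c->rows != output.rows || c->cols != output.cols))
    {
        return { false, "C must have the same shape as the output (assumption of this sketch)" };
    }
    return { true, "" };
}
```

The usual pattern is to call the validation step first and only proceed to configure() when it reports success, so that errors surface as a status value rather than as a failure during configuration.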


The documentation for this class was generated from the following files: