Compute Library
 21.08
CLGEMMLowpMatrixMultiplyCore Class Reference

Basic function to execute GEMMLowpMatrixMultiplyCore on OpenCL. More...

#include <CLGEMMLowpMatrixMultiplyCore.h>

Collaboration diagram for CLGEMMLowpMatrixMultiplyCore:

Public Member Functions

 CLGEMMLowpMatrixMultiplyCore (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Constructor. More...
 
 CLGEMMLowpMatrixMultiplyCore (const CLGEMMLowpMatrixMultiplyCore &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 CLGEMMLowpMatrixMultiplyCore (CLGEMMLowpMatrixMultiplyCore &&)=default
 Default move constructor. More...
 
CLGEMMLowpMatrixMultiplyCore & operator= (const CLGEMMLowpMatrixMultiplyCore &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
CLGEMMLowpMatrixMultiplyCore & operator= (CLGEMMLowpMatrixMultiplyCore &&)=default
 Default move assignment operator. More...
 
 ~CLGEMMLowpMatrixMultiplyCore ()
 Default destructor. More...
 
void configure (const ICLTensor *a, const ICLTensor *b, const ICLTensor *c, ICLTensor *output, const GEMMInfo &gemm_info=GEMMInfo())
 Initialise the kernel's inputs and output. More...
 
void configure (const CLCompileContext &compile_context, const ICLTensor *a, const ICLTensor *b, const ICLTensor *c, ICLTensor *output, const GEMMInfo &gemm_info=GEMMInfo())
 Initialise the kernel's inputs and output. More...
 
void run () override
 Run the kernels contained in the function. More...
 
void prepare () override
 Prepare the function for executing. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *a, const ITensorInfo *b, const ITensorInfo *c, const ITensorInfo *output, const GEMMInfo &gemm_info=GEMMInfo())
 Static function to check if given info will lead to a valid configuration of CLGEMMLowpMatrixMultiplyCore. More...
 

Detailed Description

Basic function to execute GEMMLowpMatrixMultiplyCore on OpenCL.

Definition at line 41 of file CLGEMMLowpMatrixMultiplyCore.h.

Constructor & Destructor Documentation

◆ CLGEMMLowpMatrixMultiplyCore() [1/3]

CLGEMMLowpMatrixMultiplyCore ( std::shared_ptr< IMemoryManager > memory_manager = nullptr )

Constructor.

Definition at line 58 of file CLGEMMLowpMatrixMultiplyCore.cpp.

References CLGEMMLowpMatrixMultiplyCore::~CLGEMMLowpMatrixMultiplyCore().

 : _impl(std::make_unique<Impl>())
{
    _impl->memory_group = MemoryGroup(memory_manager);
}

◆ CLGEMMLowpMatrixMultiplyCore() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ CLGEMMLowpMatrixMultiplyCore() [3/3]

Default move constructor.

◆ ~CLGEMMLowpMatrixMultiplyCore()

Default destructor.

Member Function Documentation

◆ configure() [1/2]

void configure ( const ICLTensor * a,
                 const ICLTensor * b,
                 const ICLTensor * c,
                 ICLTensor *       output,
                 const GEMMInfo &  gemm_info = GEMMInfo()
               )

Initialise the kernel's inputs and output.

Valid data layouts:

  • NHWC
  • NCHW

Valid data type configurations:

src0            src1                 src2  dst
QASYMM8         QASYMM8              S32   QASYMM8
QASYMM8         QSYMM8_PER_CHANNEL   S32   QASYMM8
QASYMM8         QSYMM8               S32   QASYMM8
QASYMM8         QASYMM8              S32   S32
QASYMM8         QSYMM8_PER_CHANNEL   S32   S32
QASYMM8         QSYMM8               S32   S32
QASYMM8_SIGNED  QASYMM8_SIGNED       S32   QASYMM8_SIGNED
QASYMM8_SIGNED  QSYMM8_PER_CHANNEL   S32   QASYMM8_SIGNED
QASYMM8_SIGNED  QSYMM8               S32   QASYMM8_SIGNED
QASYMM8_SIGNED  QASYMM8_SIGNED       S32   S32
QASYMM8_SIGNED  QSYMM8_PER_CHANNEL   S32   S32
QASYMM8_SIGNED  QSYMM8               S32   S32
Note
GEMMLowp: low-precision GEMM kernel [A * B + C]. This kernel performs the following computations:
  1. Convert a values from 8-bit quantized to int32 and add a_offset to each of them.
  2. Convert b values from 8-bit quantized to int32 and add b_offset to each of them.
  3. Compute the matrix product of the resulting a * b in int32.
  4. Quantize to uint8 if gemm_info.gemmlowp_output_stage != NONE
Parameters
[in]  a         First input tensor (Matrix A). Data type supported: QASYMM8/QASYMM8_SIGNED.
[in]  b         Second input tensor (Matrix B). Data type supported: QASYMM8/QASYMM8_SIGNED/QSYMM8/QSYMM8_PER_CHANNEL
[in]  c         Third input tensor (Matrix C). It can be a nullptr. Data type supported: S32
[out] output    Output tensor. Data type supported: S32 or QASYMM8/QASYMM8_SIGNED if gemm_info.gemmlowp_output_stage != NONE
[in]  gemm_info (Optional) Specifies if the matrix A and/or matrix B have been reshaped and if the reshape of matrix B should be executed only for the first run

Definition at line 66 of file CLGEMMLowpMatrixMultiplyCore.cpp.

References CLKernelLibrary::get().

Referenced by CLQLSTMLayer::CLQLSTMLayer(), CLGEMMDeconvolutionLayer::configure(), and CLLSTMLayerQuantized::configure().

{
    configure(CLKernelLibrary::get().get_compile_context(), a, b, c, output, gemm_info);
}

◆ configure() [2/2]

void configure ( const CLCompileContext & compile_context,
                 const ICLTensor *        a,
                 const ICLTensor *        b,
                 const ICLTensor *        c,
                 ICLTensor *              output,
                 const GEMMInfo &         gemm_info = GEMMInfo()
               )

Initialise the kernel's inputs and output.

Note
GEMMLowp: low-precision GEMM kernel [A * B + C]. This kernel performs the following computations:
  1. Convert a values from 8-bit quantized to int32 and add a_offset to each of them.
  2. Convert b values from 8-bit quantized to int32 and add b_offset to each of them.
  3. Compute the matrix product of the resulting a * b in int32.
  4. Quantize to uint8 if gemm_info.gemmlowp_output_stage != NONE
Parameters
[in]  compile_context The compile context to be used.
[in]  a               First input tensor (Matrix A). Data type supported: QASYMM8/QASYMM8_SIGNED.
[in]  b               Second input tensor (Matrix B). Data type supported: same as a
[in]  c               Third input tensor (Matrix C). It can be a nullptr. Data type supported: S32
[out] output          Output tensor. Data type supported: S32 or QASYMM8/QASYMM8_SIGNED if gemm_info.gemmlowp_output_stage != NONE
[in]  gemm_info       (Optional) Specifies if the matrix A and/or matrix B have been reshaped and if the reshape of matrix B should be executed only for the first run

Definition at line 71 of file CLGEMMLowpMatrixMultiplyCore.cpp.

References arm_compute::ACL_DST, arm_compute::ACL_SRC_0, arm_compute::ACL_SRC_1, arm_compute::ACL_SRC_2, ARM_COMPUTE_ERROR_ON_NULLPTR, arm_compute::test::validation::b, ITensor::info(), and GEMMInfo::retain_internal_weights().

{
    ARM_COMPUTE_ERROR_ON_NULLPTR(a, b, output);

    _impl->b           = b;
    _impl->op          = std::make_unique<OperatorType>();
    _impl->is_prepared = gemm_info.retain_internal_weights();

    _impl->op->configure(compile_context, a->info(), b->info(), c != nullptr ? c->info() : nullptr, output->info(), gemm_info);
    _impl->aux_mem_req = _impl->op->workspace();

    // Manage/allocate auxiliary tensors
    if(_impl->is_prepared)
    {
        _impl->run_pack.add_const_tensor(ACL_SRC_0, a);
        _impl->run_pack.add_tensor(ACL_DST, output);
    }
    else
    {
        _impl->run_pack          = { { ACL_SRC_0, a }, { ACL_SRC_1, _impl->b }, { ACL_SRC_2, c }, { ACL_DST, output } };
        _impl->workspace_tensors = manage_workspace<CLTensor>(_impl->op->workspace(), _impl->memory_group, _impl->run_pack, _impl->run_pack);
    }
}
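When gemm_info.gemmlowp_output_stage != NONE, the int32 accumulators are requantized to 8 bits (step 4 in the note above). A hypothetical floating-point sketch of that step, not the library's kernel (`quantize_down`, `result_multiplier`, and `result_offset` are illustrative names, not the library's API):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical sketch of the quantize-down output stage: scale the int32
// accumulator, add the output zero-point, then saturate to the QASYMM8
// range [0, 255].
uint8_t quantize_down(int32_t acc, float result_multiplier, int32_t result_offset)
{
    int32_t q = int32_t(std::lround(acc * result_multiplier)) + result_offset;
    return uint8_t(std::min(255, std::max(0, q)));
}
```

For example, an accumulator of 100 with multiplier 0.5 and zero-point 3 maps to 53, while out-of-range results saturate at 0 or 255.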

◆ operator=() [1/2]

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

Default move assignment operator.

◆ prepare()

void prepare ( )
override virtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here

Note
The prepare stage might not need all of the function's buffers' backing memory to be available in order to execute

Reimplemented from IFunction.

Definition at line 109 of file CLGEMMLowpMatrixMultiplyCore.cpp.

References arm_compute::release_temporaries().

Referenced by CLGEMMDeconvolutionLayer::prepare(), and CLGEMMLowpMatrixMultiplyCore::run().

{
    if(!_impl->is_prepared)
    {
        _impl->op->prepare(_impl->run_pack);

        // Release temporary tensors that are only used in prepare stage
        release_temporaries(_impl->aux_mem_req, _impl->workspace_tensors);

        _impl->is_prepared = true;
    }
}

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For CPU kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
This function will call prepare() on the first run if it has not already been done

Implements IFunction.

Definition at line 100 of file CLGEMMLowpMatrixMultiplyCore.cpp.

References CLGEMMLowpMatrixMultiplyCore::prepare().

Referenced by CLGEMMDeconvolutionLayer::run(), CLLSTMLayerQuantized::run(), and CLQLSTMLayer::run().

{
    prepare();

    MemoryGroupResourceScope scope_mg(_impl->memory_group);

    _impl->op->run(_impl->run_pack);
}
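The prepare-on-first-run behaviour described above can be sketched in isolation (an illustrative stand-in, not the library's code; `FunctionSketch` and its counters are invented for demonstration):

```cpp
// Illustrative sketch of the prepare-on-first-run pattern: run() triggers
// prepare() once, and prepare() becomes a no-op afterwards.
struct FunctionSketch
{
    bool is_prepared   = false;
    int  prepare_count = 0; // how many times the one-off work actually ran
    int  run_count     = 0;

    void prepare()
    {
        if(!is_prepared)
        {
            // One-off pre-processing would go here (e.g. reshaping matrix B
            // and releasing prepare-only temporaries).
            ++prepare_count;
            is_prepared = true;
        }
    }

    void run()
    {
        prepare(); // called on the first run if it has not been done yet
        ++run_count;
    }
};
```

Calling run() repeatedly executes the kernels each time, but the one-off preparation work only happens on the first call.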

◆ validate()

static Status validate ( const ITensorInfo * a,
                         const ITensorInfo * b,
                         const ITensorInfo * c,
                         const ITensorInfo * output,
                         const GEMMInfo &    gemm_info = GEMMInfo()
                       )

Static function to check if given info will lead to a valid configuration of CLGEMMLowpMatrixMultiplyCore.

Parameters
[in] a         First input tensor info (Matrix A). Data type supported: QASYMM8.
[in] b         Second input tensor info (Matrix B). Data type supported: QASYMM8/QASYMM8_SIGNED/QSYMM8/QSYMM8_PER_CHANNEL
[in] c         Third input tensor info (Matrix C). It can be a nullptr. Data type supported: S32
[in] output    Output tensor info. Data type supported: S32 or QASYMM8/QASYMM8_SIGNED if gemm_info.gemmlowp_output_stage != NONE
[in] gemm_info (Optional) Specifies if the matrix A and/or matrix B have been reshaped and if the reshape of matrix B should be executed only for the first run
Returns
a status

Definition at line 95 of file CLGEMMLowpMatrixMultiplyCore.cpp.

References ClGemm::validate().

Referenced by CLGEMMDeconvolutionLayer::validate(), and CLLSTMLayerQuantized::validate().

{
    return OperatorType::validate(a, b, c, output, gemm_info);
}

The documentation for this class was generated from the following files:

  • CLGEMMLowpMatrixMultiplyCore.h
  • CLGEMMLowpMatrixMultiplyCore.cpp