Compute Library
 21.02
GEMMKernelInfo Struct Reference

Descriptor used by the GEMM kernels.

#include <KernelDescriptors.h>


Public Member Functions

 GEMMKernelInfo ()=default
 
 GEMMKernelInfo (unsigned int im, unsigned int in, unsigned int ik, unsigned int idepth_output_gemm3d, bool ireinterpret_input_as_3d, bool ibroadcast_bias, bool ifp_mixed_precision, bool ihas_pad_y, ActivationLayerInfo iactivation_info, int inmult_transpose1xW_width, int imult_interleave4x4_height, GEMMLHSMatrixInfo ilhs_info, GEMMRHSMatrixInfo irhs_info, int32_t ina_offset, int32_t inb_offset)
 

Data Fields

unsigned int m { 0 }
 Number of LHS rows.

unsigned int n { 0 }
 Number of RHS columns.

unsigned int k { 0 }
 Number of LHS columns or RHS rows.

unsigned int depth_output_gemm3d { 0 }
 Depth of the output tensor when it is reinterpreted as 3D.

bool reinterpret_input_as_3d { false }
 Flag used to reinterpret the input as 3D.

bool broadcast_bias { false }
 Flag used to broadcast the bias addition.

bool fp_mixed_precision { false }
 Flag used to indicate wider accumulators (32-bit instead of 16-bit for FP16).

bool has_pad_y { false }
 Flag used to indicate whether the input/output tensors have internal padding in the y direction.

ActivationLayerInfo activation_info {}
 Activation function to perform after the matrix multiplication.

int mult_transpose1xW_width { 1 }
 Multiplication factor for the width of the 1xW transposed block.

int mult_interleave4x4_height { 1 }
 Multiplication factor for the height of the 4x4 interleaved block.

GEMMLHSMatrixInfo lhs_info {}
 LHS matrix information used to retrieve the number of rows processed by each thread.

GEMMRHSMatrixInfo rhs_info {}
 RHS matrix information used for reshaping the RHS matrix.

int32_t a_offset { 0 }
 Offset to be added to each element of the matrix A.

int32_t b_offset { 0 }
 Offset to be added to each element of the matrix B.

GEMMLowpOutputStageInfo output_stage {}
 GEMMLowp output stage information.
 

Detailed Description

Descriptor used by the GEMM kernels.

Definition at line 56 of file KernelDescriptors.h.

Constructor & Destructor Documentation

◆ GEMMKernelInfo() [1/2]

GEMMKernelInfo ( )
default

◆ GEMMKernelInfo() [2/2]

GEMMKernelInfo ( unsigned int  im,
unsigned int  in,
unsigned int  ik,
unsigned int  idepth_output_gemm3d,
bool  ireinterpret_input_as_3d,
bool  ibroadcast_bias,
bool  ifp_mixed_precision,
bool  ihas_pad_y,
ActivationLayerInfo  iactivation_info,
int  inmult_transpose1xW_width,
int  imult_interleave4x4_height,
GEMMLHSMatrixInfo  ilhs_info,
GEMMRHSMatrixInfo  irhs_info,
int32_t  ina_offset,
int32_t  inb_offset 
)
inline

Definition at line 59 of file KernelDescriptors.h.

    : m(im), n(in), k(ik), depth_output_gemm3d(idepth_output_gemm3d), reinterpret_input_as_3d(ireinterpret_input_as_3d), broadcast_bias(ibroadcast_bias), fp_mixed_precision(ifp_mixed_precision),
      has_pad_y(ihas_pad_y), activation_info(iactivation_info), mult_transpose1xW_width(inmult_transpose1xW_width), mult_interleave4x4_height(imult_interleave4x4_height), lhs_info(ilhs_info),
      rhs_info(irhs_info), a_offset(ina_offset), b_offset(inb_offset)
    {
    }
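
As a rough sketch of how the two constructors differ, the following self-contained C++ mirrors the descriptor's fields with stand-in types. `GemmKernelInfoSketch`, `ActivationStub`, `LhsStub`, `RhsStub`, `OutputStageStub` (with its `offset` member) and `make_quantized_info` are hypothetical names introduced for illustration; none of them is part of the library. It illustrates that the 15-argument constructor, as listed above, initializes every field except output_stage, so that field would have to be assigned separately.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins: the real library types carry more state.
struct ActivationStub {};
struct LhsStub {};
struct RhsStub {};
struct OutputStageStub { std::int32_t offset{ 0 }; };

// Reduced mirror of the descriptor documented on this page (not the library's definition).
struct GemmKernelInfoSketch
{
    GemmKernelInfoSketch() = default;
    GemmKernelInfoSketch(unsigned int im, unsigned int in, unsigned int ik, unsigned int idepth_output_gemm3d,
                         bool ireinterpret_input_as_3d, bool ibroadcast_bias, bool ifp_mixed_precision, bool ihas_pad_y,
                         ActivationStub iactivation_info, int inmult_transpose1xW_width, int imult_interleave4x4_height,
                         LhsStub ilhs_info, RhsStub irhs_info, std::int32_t ina_offset, std::int32_t inb_offset)
        : m(im), n(in), k(ik), depth_output_gemm3d(idepth_output_gemm3d), reinterpret_input_as_3d(ireinterpret_input_as_3d),
          broadcast_bias(ibroadcast_bias), fp_mixed_precision(ifp_mixed_precision), has_pad_y(ihas_pad_y),
          activation_info(iactivation_info), mult_transpose1xW_width(inmult_transpose1xW_width),
          mult_interleave4x4_height(imult_interleave4x4_height), lhs_info(ilhs_info), rhs_info(irhs_info),
          a_offset(ina_offset), b_offset(inb_offset)
    {
    }

    unsigned int    m{ 0 };
    unsigned int    n{ 0 };
    unsigned int    k{ 0 };
    unsigned int    depth_output_gemm3d{ 0 };
    bool            reinterpret_input_as_3d{ false };
    bool            broadcast_bias{ false };
    bool            fp_mixed_precision{ false };
    bool            has_pad_y{ false };
    ActivationStub  activation_info{};
    int             mult_transpose1xW_width{ 1 };
    int             mult_interleave4x4_height{ 1 };
    LhsStub         lhs_info{};
    RhsStub         rhs_info{};
    std::int32_t    a_offset{ 0 };
    std::int32_t    b_offset{ 0 };
    OutputStageStub output_stage{}; // not a constructor parameter
};

// Builds a descriptor positionally, then sets the one field the constructor does not take.
// The offsets -3 and -5 and the stage offset 10 are arbitrary illustration values.
GemmKernelInfoSketch make_quantized_info(unsigned int m, unsigned int n, unsigned int k)
{
    assert(m > 0 && n > 0 && k > 0);
    GemmKernelInfoSketch info(m, n, k, 0, false, true, false, false, ActivationStub{}, 1, 1, LhsStub{}, RhsStub{}, -3, -5);
    info.output_stage.offset = 10;
    return info;
}
```

Because all fifteen arguments are positional and several share a type, a caller could also default-construct the descriptor and assign fields by name, which keeps each value next to its meaning.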

Field Documentation

◆ a_offset

int32_t a_offset { 0 }

Offset to be added to each element of the matrix A.

◆ activation_info

ActivationLayerInfo activation_info {}

Activation function to perform after the matrix multiplication.

Definition at line 89 of file KernelDescriptors.h.

◆ b_offset

int32_t b_offset { 0 }

Offset to be added to each element of the matrix B.
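
To make the arithmetic meaning of a_offset and b_offset concrete, here is a minimal reference sketch of an integer matrix multiplication in which each offset is added to every element of its matrix before the multiply. The function name `gemm_with_offsets`, the row-major layout and the `uint8_t` element type are assumptions made for this example; the page does not state how the library's kernels apply the offsets internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference accumulation for an m x k matrix A and a k x n matrix B, both row-major.
// Computes C[i][j] = sum over p of (A[i][p] + a_offset) * (B[p][j] + b_offset) in int32.
std::vector<std::int32_t> gemm_with_offsets(const std::vector<std::uint8_t> &a, const std::vector<std::uint8_t> &b,
                                            std::size_t m, std::size_t n, std::size_t k,
                                            std::int32_t a_offset, std::int32_t b_offset)
{
    assert(a.size() == m * k);
    assert(b.size() == k * n);

    std::vector<std::int32_t> c(m * n, 0);
    for(std::size_t i = 0; i < m; ++i)
    {
        for(std::size_t j = 0; j < n; ++j)
        {
            std::int32_t acc = 0;
            for(std::size_t p = 0; p < k; ++p)
            {
                const std::int32_t lhs = static_cast<std::int32_t>(a[i * k + p]) + a_offset;
                const std::int32_t rhs = static_cast<std::int32_t>(b[p * n + j]) + b_offset;
                acc += lhs * rhs;
            }
            c[i * n + j] = acc;
        }
    }
    return c;
}
```

With A = {1, 2, 3, 4} and B = {5, 6, 7, 8} as 2x2 matrices, a_offset = -1 and b_offset = -5, both shifted matrices become {0, 1, 2, 3} and the result is {2, 3, 6, 11}.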

◆ broadcast_bias

bool broadcast_bias { false }

Flag used to broadcast the bias addition.

Definition at line 86 of file KernelDescriptors.h.

Referenced by CLGEMMMatrixMultiplyNativeKernel::configure(), CLGEMMMatrixMultiplyReshapedOnlyRHSKernel::configure(), and arm_compute::operator<<().
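
The sketch below illustrates one plausible reading of broadcast_bias: when the flag is set, a bias vector with one value per output column is added to every row; when it is clear, the bias is a full m x n matrix added element by element. That interpretation, the function name `add_bias` and the row-major float layout are assumptions for illustration, not something this page specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Adds a bias to an m x n row-major output in place.
// broadcast_bias == true : bias holds n values, reused for every row.
// broadcast_bias == false: bias holds m * n values, one per output element.
void add_bias(std::vector<float> &c, const std::vector<float> &bias,
              std::size_t m, std::size_t n, bool broadcast_bias)
{
    assert(c.size() == m * n);
    assert(bias.size() == (broadcast_bias ? n : m * n));

    for(std::size_t i = 0; i < m; ++i)
    {
        for(std::size_t j = 0; j < n; ++j)
        {
            const std::size_t bias_index = broadcast_bias ? j : i * n + j;
            c[i * n + j] += bias[bias_index];
        }
    }
}
```

For a 2x2 output {1, 2, 3, 4} and a broadcast bias {10, 20}, the result is {11, 22, 13, 24}.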

◆ depth_output_gemm3d

unsigned int depth_output_gemm3d { 0 }

Depth of the output tensor when it is reinterpreted as 3D.
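
The field summary describes depth_output_gemm3d as the depth of the output tensor when it is reinterpreted as 3D. One way to picture this, offered as an assumption rather than a statement of the library's layout, is that the m output rows are split into depth_output_gemm3d slices of m / depth_output_gemm3d rows each. The helper `row_to_yz` below is a hypothetical name that maps a 2D output row to a (y, z) pair under that assumption.

```cpp
#include <cassert>
#include <utility>

// Maps a row of the 2D GEMM output (0 <= row < m) to (y, z) coordinates of a 3D view
// with `depth_output_gemm3d` slices. Assumes m is an exact multiple of the depth and
// that rows are laid out slice after slice; both are assumptions of this sketch.
std::pair<unsigned int, unsigned int> row_to_yz(unsigned int row, unsigned int m, unsigned int depth_output_gemm3d)
{
    assert(depth_output_gemm3d != 0);
    assert(m % depth_output_gemm3d == 0);
    assert(row < m);

    const unsigned int height = m / depth_output_gemm3d; // rows per slice
    return { row % height, row / height };               // (y, z)
}
```

With m = 6 and a depth of 2, each slice holds 3 rows, so row 4 maps to y = 1 in slice z = 1.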

◆ fp_mixed_precision

bool fp_mixed_precision { false }

Flag used to indicate wider accumulators (32-bit instead of 16-bit for FP16).

Definition at line 87 of file KernelDescriptors.h.

Referenced by arm_compute::operator<<().
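
To show why a wider accumulator can matter for FP16 data, the following sketch emulates FP16 precision in plain C++ by rounding a float to an 11-bit significand after every addition. It models precision only (not exponent range, subnormals or overflow), and the names `round_to_fp16_precision`, `accumulate_narrow` and `accumulate_wide` are hypothetical; this is not how the library implements the flag.

```cpp
#include <cassert>
#include <cmath>

// Rounds x to an 11-bit significand, the precision of IEEE 754 binary16.
// Exponent range, subnormals and overflow are deliberately ignored.
float round_to_fp16_precision(float x)
{
    if(x == 0.0f || !std::isfinite(x))
    {
        return x;
    }
    int         exponent = 0;
    const float mantissa = std::frexp(x, &exponent); // |mantissa| in [0.5, 1)
    const float scaled   = std::ldexp(mantissa, 11); // keep 11 significant bits
    return std::ldexp(std::nearbyint(scaled), exponent - 11);
}

// Sums `count` ones, rounding the running total to FP16 precision after each step.
float accumulate_narrow(int count)
{
    assert(count >= 0);
    float acc = 0.0f;
    for(int i = 0; i < count; ++i)
    {
        acc = round_to_fp16_precision(acc + 1.0f);
    }
    return acc;
}

// Sums `count` ones in a 32-bit float accumulator.
float accumulate_wide(int count)
{
    assert(count >= 0);
    float acc = 0.0f;
    for(int i = 0; i < count; ++i)
    {
        acc += 1.0f;
    }
    return acc;
}
```

Under the default round-to-nearest-even mode, summing 4096 ones with the narrow accumulator stalls at 2048, because 2049 is not representable with 11 significant bits and rounds back down, while the wide accumulator reaches 4096.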

◆ has_pad_y

bool has_pad_y { false }

Flag used to indicate whether the input/output tensors have internal padding in the y direction.

Definition at line 88 of file KernelDescriptors.h.

Referenced by CLGEMMReshapeRHSMatrixKernelManaged::configure(), and CLGEMMMatrixMultiplyReshapedOnlyRHSKernel::configure().

◆ k

unsigned int k { 0 }

Number of LHS columns or RHS rows.

◆ lhs_info

GEMMLHSMatrixInfo lhs_info {}

LHS matrix information used to retrieve the number of rows processed by each thread.

Definition at line 92 of file KernelDescriptors.h.

Referenced by CLGEMMReshapeRHSMatrixKernelManaged::configure(), CLGEMMLowpMatrixMultiplyCore::configure(), CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel::configure(), and CLGEMMLowpMatrixMultiplyCore::validate().

◆ m

unsigned int m { 0 }

Number of LHS rows.

◆ mult_interleave4x4_height

int mult_interleave4x4_height { 1 }

Multiplication factor for the height of the 4x4 interleaved block.

Definition at line 91 of file KernelDescriptors.h.

Referenced by arm_compute::operator<<().

◆ mult_transpose1xW_width

int mult_transpose1xW_width { 1 }

Multiplication factor for the width of the 1xW transposed block.

Definition at line 90 of file KernelDescriptors.h.

Referenced by arm_compute::operator<<().

◆ n

unsigned int n { 0 }

Number of RHS columns.

◆ output_stage

GEMMLowpOutputStageInfo output_stage {}

GEMMLowp output stage information.

◆ reinterpret_input_as_3d

bool reinterpret_input_as_3d { false }

Flag used to reinterpret the input as 3D.

◆ rhs_info

GEMMRHSMatrixInfo rhs_info {}

RHS matrix information used for reshaping the RHS matrix.


The documentation for this struct was generated from the following file: