24.02.1
Basic function to execute GEMMLowpMatrixMultiplyCore.
#include <CpuGemmLowpMatrixMultiplyCore.h>
Public Member Functions | |
CpuGemmLowpMatrixMultiplyCore () | |
Constructor. | |
ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE (CpuGemmLowpMatrixMultiplyCore) | |
~CpuGemmLowpMatrixMultiplyCore () | |
Destructor. | |
void | configure (const ITensorInfo *a, const ITensorInfo *b, const ITensorInfo *c, ITensorInfo *dst, const GEMMInfo &gemm_info=GEMMInfo()) |
Initialise the kernel's inputs, output. | |
void | run (ITensorPack &tensors) override |
Run the kernels contained in the function. | |
void | prepare (ITensorPack &tensors) override |
Prepare the function for executing. | |
experimental::MemoryRequirements | workspace () const override |
Return the memory requirements required by the workspace. | |
Public Member Functions inherited from INEOperator | |
INEOperator (IRuntimeContext *ctx=nullptr) | |
Constructor. | |
INEOperator (const INEOperator &)=delete | |
Prevent instances of this class from being copied (As this class contains pointers) | |
INEOperator (INEOperator &&)=default | |
Default move constructor. | |
INEOperator & | operator= (const INEOperator &)=delete |
Prevent instances of this class from being copied (As this class contains pointers) | |
INEOperator & | operator= (INEOperator &&)=default |
Default move assignment operator. | |
~INEOperator () | |
Default destructor. | |
Public Member Functions inherited from IOperator | |
virtual | ~IOperator ()=default |
Destructor. | |
Static Public Member Functions | |
static Status | validate (const ITensorInfo *a, const ITensorInfo *b, const ITensorInfo *c, const ITensorInfo *dst, const GEMMInfo &gemm_info=GEMMInfo()) |
Static function to check if given info will lead to a valid configuration. | |
Basic function to execute GEMMLowpMatrixMultiplyCore.
This function calls the following kernels if the DOT product instruction is not available:
otherwise if the DOT product instruction is available:
Definition at line 66 of file CpuGemmLowpMatrixMultiplyCore.h.
~CpuGemmLowpMatrixMultiplyCore () |
default |
Destructor.
ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE | ( | CpuGemmLowpMatrixMultiplyCore | ) |
void configure | ( | const ITensorInfo * | a, |
const ITensorInfo * | b, | ||
const ITensorInfo * | c, | ||
ITensorInfo * | dst, | ||
const GEMMInfo & | gemm_info = GEMMInfo() |
||
) |
Initialise the kernel's inputs, output.
Valid data layouts:
Valid data type configurations:
src0 | src1 | src2 | dst |
---|---|---|---|
QASYMM8 | QASYMM8 | S32 | QASYMM8 |
QASYMM8 | QSYMM8_PER_CHANNEL | S32 | QASYMM8 |
QASYMM8 | QSYMM8 | S32 | QASYMM8 |
QASYMM8 | QASYMM8 | S32 | S32 |
QASYMM8 | QSYMM8_PER_CHANNEL | S32 | S32 |
QASYMM8 | QSYMM8 | S32 | S32 |
QASYMM8_SIGNED | QASYMM8_SIGNED | S32 | QASYMM8_SIGNED |
QASYMM8_SIGNED | QSYMM8_PER_CHANNEL | S32 | QASYMM8_SIGNED |
QASYMM8_SIGNED | QSYMM8 | S32 | QASYMM8_SIGNED |
QASYMM8_SIGNED | QASYMM8_SIGNED | S32 | S32 |
QASYMM8_SIGNED | QSYMM8_PER_CHANNEL | S32 | S32 |
QASYMM8_SIGNED | QSYMM8 | S32 | S32 |
The output type is S32 if gemm_info.type == GEMMLowpOutputStageType::NONE. It is QASYMM8/QASYMM8_SIGNED otherwise.
[in] | a | First input tensor info (Matrix A). Data type supported: QASYMM8/QASYMM8_SIGNED. |
[in] | b | Second input tensor info (Matrix B). Data type supported: QASYMM8/QASYMM8_SIGNED/QSYMM8/QSYMM8_PER_CHANNEL. |
[in] | c | Third input tensor info (Matrix C). It can be a nullptr. Data type supported: S32. |
[out] | dst | Output tensor info. Data type supported: S32/QASYMM8/QASYMM8_SIGNED. |
[in] | gemm_info | (Optional) Specifies if the matrix A and/or matrix B have been reshaped and if the reshape of matrix B should be executed only for the first run. |
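As an illustration of the S32 output case (no output stage), here is a minimal reference sketch of a low-precision matrix multiply with 32-bit accumulation. It assumes the common affine quantization convention, real = scale * (q - zero_point), and row-major storage. The name `lowp_gemm_s32` and all of its parameters are hypothetical; this plain loop is not the library's kernel sequence and ignores the optional matrix C.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper, not part of Arm Compute Library.
// Computes dst[m][n] = sum_k (a[m][k] - a_zero) * (b[k][n] - b_zero)
// for an M x K matrix a and a K x N matrix b (both row-major, 8-bit),
// accumulating into signed 32-bit integers.
std::vector<int32_t> lowp_gemm_s32(const std::vector<uint8_t> &a,
                                   const std::vector<uint8_t> &b,
                                   int M, int N, int K,
                                   int32_t a_zero, int32_t b_zero)
{
    std::vector<int32_t> dst(static_cast<std::size_t>(M) * N, 0);
    for (int m = 0; m < M; ++m)
    {
        for (int n = 0; n < N; ++n)
        {
            int32_t acc = 0;
            for (int k = 0; k < K; ++k)
            {
                // Widen to 32 bits before removing the zero point.
                const int32_t av = static_cast<int32_t>(a[m * K + k]) - a_zero;
                const int32_t bv = static_cast<int32_t>(b[k * N + n]) - b_zero;
                acc += av * bv;
            }
            dst[m * N + n] = acc;
        }
    }
    return dst;
}
```

Calling `lowp_gemm_s32(a, b, 2, 2, 2, 0, 0)` on two 2x2 matrices gives the plain integer product. Non-zero zero points shift each operand before the multiply.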
Definition at line 108 of file CpuGemmLowpMatrixMultiplyCore.cpp.
References GEMMInfo::activation_info(), ARM_COMPUTE_ERROR, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, ARM_COMPUTE_LOG_PARAMS, arm_compute::test::validation::b, ICloneable< T >::clone(), arm_compute::misc::shape_calculator::compute_interleaved_shape(), arm_compute::misc::shape_calculator::compute_reductionA_shape(), arm_compute::misc::shape_calculator::compute_reductionB_shape(), arm_compute::misc::shape_calculator::compute_transpose1xW_shape(), ITensorInfo::data_type(), ITensorInfo::dimension(), arm_compute::test::validation::dst, dt, ActivationLayerInfo::enabled(), GEMMLowpOutputStageInfo::gemmlowp_max_bound, GEMMLowpOutputStageInfo::gemmlowp_min_bound, GEMMLowpOutputStageInfo::gemmlowp_offset, arm_compute::test::validation::info, CpuGemmAssemblyDispatch::is_activation_supported(), arm_compute::is_data_type_quantized_asymmetric(), arm_compute::is_data_type_quantized_per_channel(), arm_compute::NONE, UniformQuantizationInfo::offset, arm_compute::offset_int_vec(), arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, ITensorInfo::quantization_info(), TensorInfo::quantization_info(), arm_compute::QUANTIZE_DOWN_FIXEDPOINT, arm_compute::S32, arm_compute::S8, UniformQuantizationInfo::scale, TensorInfo::total_size(), arm_compute::U8, QuantizationInfo::uniform(), and CpuGemmLowpMatrixMultiplyCore::validate().
void prepare (ITensorPack &tensors) |
override virtual |
Prepare the function for executing.
Any one-off pre-processing step required by the function is handled here.
[in] | constants | Vector that contains the constant tensors. |
Reimplemented from INEOperator.
Definition at line 694 of file CpuGemmLowpMatrixMultiplyCore.cpp.
References arm_compute::ACL_DST, arm_compute::ACL_SRC, arm_compute::ACL_SRC_1, Window::DimX, Window::DimY, Scheduler::get(), CpuAuxTensorHandler::get(), ITensorPack::get_const_tensor(), ITensorPack::get_tensor(), arm_compute::offset_int_vec(), arm_compute::test::validation::pack, and IScheduler::schedule_op().
Referenced by CpuGemmLowpMatrixMultiplyCore::run().
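Because prepare() is referenced by run(), one plausible way to read the "one-off" wording is that the pre-processing is guarded so repeated runs do not redo it. The sketch below shows that pattern in isolation. `PrepareOnceOperator`, its members, and the choice of a transpose as the pre-processing step are all hypothetical and only stand in for whatever the real operator does with its constant tensors.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical operator illustrating a prepare-once pattern.
// It is not the library's class and does not use ITensorPack.
class PrepareOnceOperator
{
public:
    // One-off pre-processing: transpose a constant rows x cols matrix.
    void prepare(const std::vector<int8_t> &weights, int rows, int cols)
    {
        if (_is_prepared)
        {
            return; // Already done on an earlier call.
        }
        _transposed.resize(weights.size());
        for (int r = 0; r < rows; ++r)
        {
            for (int c = 0; c < cols; ++c)
            {
                _transposed[static_cast<std::size_t>(c) * rows + r] =
                    weights[static_cast<std::size_t>(r) * cols + c];
            }
        }
        _is_prepared = true;
        ++_prepare_calls;
    }

    // run() calls prepare() first, so callers may skip an explicit prepare().
    void run(const std::vector<int8_t> &weights, int rows, int cols)
    {
        prepare(weights, rows, cols);
        ++_run_calls; // Placeholder for the actual per-run work.
    }

    int prepare_calls() const { return _prepare_calls; }
    int run_calls() const { return _run_calls; }
    const std::vector<int8_t> &transposed() const { return _transposed; }

private:
    bool                _is_prepared{false};
    int                 _prepare_calls{0};
    int                 _run_calls{0};
    std::vector<int8_t> _transposed{};
};
```

With this structure, calling run() several times performs the transpose only once, which is the behaviour a caller would expect from a one-off preparation step.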
void run (ITensorPack &tensors) |
override virtual |
Run the kernels contained in the function.
[in] | tensors | Vector that contains the tensors to operate on. |
Reimplemented from INEOperator.
Definition at line 552 of file CpuGemmLowpMatrixMultiplyCore.cpp.
References arm_compute::ACL_DST, arm_compute::ACL_SRC, arm_compute::ACL_SRC_0, arm_compute::ACL_SRC_1, arm_compute::ACL_SRC_2, arm_compute::ACL_SRC_3, ITensorPack::add_const_tensor(), ITensorPack::add_tensor(), arm_compute::test::validation::b, Window::DimX, Window::DimY, arm_compute::test::validation::dst, GEMMInfo::gemmlowp_output_stage(), Scheduler::get(), CpuAuxTensorHandler::get(), ITensorPack::get_const_tensor(), ITensorPack::get_tensor(), arm_compute::is_data_type_quantized_asymmetric(), arm_compute::offset_int_vec(), arm_compute::test::validation::pack, CpuGemmLowpMatrixMultiplyCore::prepare(), arm_compute::QUANTIZE_DOWN_FIXEDPOINT, IScheduler::schedule_op(), and GEMMLowpOutputStageInfo::type.
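static Status validate (const ITensorInfo *a, const ITensorInfo *b, const ITensorInfo *c, const ITensorInfo *dst, const GEMMInfo &gemm_info=GEMMInfo()) |
static |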
Static function to check if given info will lead to a valid configuration.
Similar to CpuGemmLowpMatrixMultiplyCore::configure()
Definition at line 326 of file CpuGemmLowpMatrixMultiplyCore.cpp.
References GEMMInfo::activation_info(), ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MSG, ARM_COMPUTE_RETURN_ON_ERROR, arm_compute::auto_init_if_empty(), arm_compute::test::validation::b, ICloneable< T >::clone(), arm_compute::misc::shape_calculator::compute_reductionA_shape(), arm_compute::misc::shape_calculator::compute_reductionB_shape(), ITensorInfo::data_type(), ITensorInfo::dimension(), dt, ActivationLayerInfo::enabled(), GEMMLowpOutputStageInfo::gemmlowp_max_bound, GEMMLowpOutputStageInfo::gemmlowp_min_bound, GEMMLowpOutputStageInfo::gemmlowp_offset, GEMMInfo::gemmlowp_output_stage(), arm_compute::test::validation::info, GEMMInfo::is_a_reshaped(), GEMMInfo::is_b_reshaped(), arm_compute::is_data_type_quantized_asymmetric(), arm_compute::is_data_type_quantized_per_channel(), arm_compute::NONE, UniformQuantizationInfo::offset, arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, arm_compute::QSYMM8, arm_compute::QSYMM8_PER_CHANNEL, ITensorInfo::quantization_info(), arm_compute::QUANTIZE_DOWN_FIXEDPOINT, arm_compute::S32, UniformQuantizationInfo::scale, TensorShape::set(), ITensorInfo::tensor_shape(), GEMMLowpOutputStageInfo::type, QuantizationInfo::uniform(), CpuActivation::validate(), CpuConvertQuantizedSignednessKernel::validate(), CpuGemmLowpMatrixAReductionKernel::validate(), CpuGemmLowpMatrixMultiplyKernel::validate(), CpuGemmInterleave4x4Kernel::validate(), CpuGemmLowpOffsetContributionKernel::validate(), CpuGemmTranspose1xWKernel::validate(), CpuGemmLowpOffsetContributionOutputStageKernel::validate(), CpuGemmLowpMatrixBReductionKernel::validate(), and CpuGemmAssemblyDispatch::validate().
Referenced by CpuGemmLowpMatrixMultiplyCore::configure(), and NEGEMMLowpMatrixMultiplyCore::validate().
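To make the idea of checking a configuration before using it concrete, the following sketch encodes only the table of valid data type configurations listed under configure(). Everything in the `dtype_sketch` namespace (the `DataType` enum, the `Status` struct, and `check_data_types`) is a hypothetical stand-in. The real validate() works on ITensorInfo objects and performs many more checks, and this sketch does not model the case where c is a nullptr.

```cpp
#include <string>

namespace dtype_sketch
{
// Hypothetical stand-ins for the library's data type and status types.
enum class DataType { QASYMM8, QASYMM8_SIGNED, QSYMM8, QSYMM8_PER_CHANNEL, S32 };

struct Status
{
    bool        ok;
    std::string message;
};

// Returns ok only for the (src0, src1, src2, dst) rows in the table.
inline Status check_data_types(DataType src0, DataType src1, DataType src2, DataType dst)
{
    // src0 must be one of the two asymmetric 8-bit types.
    if (src0 != DataType::QASYMM8 && src0 != DataType::QASYMM8_SIGNED)
    {
        return {false, "src0 must be QASYMM8 or QASYMM8_SIGNED"};
    }
    // src1 either matches src0 or is a symmetric 8-bit type.
    const bool src1_ok = (src1 == src0) || (src1 == DataType::QSYMM8) ||
                         (src1 == DataType::QSYMM8_PER_CHANNEL);
    if (!src1_ok)
    {
        return {false, "src1 is not compatible with src0"};
    }
    // src2 is S32 in every row of the table.
    if (src2 != DataType::S32)
    {
        return {false, "src2 must be S32"};
    }
    // dst is either S32 or the same type as src0.
    if (dst != DataType::S32 && dst != src0)
    {
        return {false, "dst must be S32 or match src0"};
    }
    return {true, ""};
}
} // namespace dtype_sketch
```

A caller would inspect the returned status and only proceed to configuration when `ok` is true, mirroring how configure() relies on validate().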
experimental::MemoryRequirements workspace () const |
override virtual |
Return the memory requirements required by the workspace.
Reimplemented from INEOperator.
Definition at line 728 of file CpuGemmLowpMatrixMultiplyCore.cpp.