Compute Library 23.08

NEGEMMConvolutionLayer Class Reference

Basic function to compute the convolution layer.

#include <NEGEMMConvolutionLayer.h>
Public Member Functions

NEGEMMConvolutionLayer (const std::shared_ptr< IMemoryManager > &memory_manager=nullptr, IWeightsManager *weights_manager=nullptr)
    Constructor.
NEGEMMConvolutionLayer (const NEGEMMConvolutionLayer &)=delete
    Prevent instances of this class from being copied (as this class contains pointers).
NEGEMMConvolutionLayer (NEGEMMConvolutionLayer &&)=delete
    Prevent instances of this class from being moved (as this class contains non-movable objects).
NEGEMMConvolutionLayer & operator= (const NEGEMMConvolutionLayer &)=delete
    Prevent instances of this class from being copied (as this class contains pointers).
NEGEMMConvolutionLayer & operator= (NEGEMMConvolutionLayer &&)=delete
    Prevent instances of this class from being moved (as this class contains non-movable objects).
~NEGEMMConvolutionLayer ()
    Default destructor.
void configure (const ITensor *input, const ITensor *weights, const ITensor *biases, ITensor *output, const PadStrideInfo &conv_info, const WeightsInfo &weights_info=WeightsInfo(), const Size2D &dilation=Size2D(1U, 1U), const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false, unsigned int num_groups=1)
    Set the input and output tensors.
void run () override
    Run the kernels contained in the function.
void prepare () override
    Prepare the function for executing.

Public Member Functions inherited from IFunction

virtual ~IFunction ()=default
    Destructor.

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, const PadStrideInfo &conv_info, const WeightsInfo &weights_info=WeightsInfo(), const Size2D &dilation=Size2D(1U, 1U), const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false, unsigned int num_groups=1)
    Static function to check if the given info will lead to a valid configuration of NEGEMMConvolutionLayer.
static Status has_opt_impl (arm_compute::WeightFormat &expected_weight_format, const ITensorInfo *src, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *dst, const PadStrideInfo &conv_info, const WeightsInfo &weights_info=WeightsInfo(), const Size2D &dilation=Size2D(1U, 1U), const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false)
    Static function to check if there is an optimized version of GEMM available for the input parameters.
Basic function to compute the convolution layer.
This function calls the following function/kernel: cpu::CpuGemmConv2d.
Definition at line 48 of file NEGEMMConvolutionLayer.h.
NEGEMMConvolutionLayer (const std::shared_ptr< IMemoryManager > &memory_manager = nullptr, IWeightsManager *weights_manager = nullptr)
Constructor.
Definition at line 49 of file NEGEMMConvolutionLayer.cpp.
NEGEMMConvolutionLayer (const NEGEMMConvolutionLayer &)=delete

Prevent instances of this class from being copied (as this class contains pointers).
NEGEMMConvolutionLayer (NEGEMMConvolutionLayer &&)=delete

Prevent instances of this class from being moved (as this class contains non-movable objects).
~NEGEMMConvolutionLayer ()=default

Default destructor.
void configure (const ITensor *input, const ITensor *weights, const ITensor *biases, ITensor *output, const PadStrideInfo &conv_info, const WeightsInfo &weights_info = WeightsInfo(), const Size2D &dilation = Size2D(1U, 1U), const ActivationLayerInfo &act_info = ActivationLayerInfo(), bool enable_fast_math = false, unsigned int num_groups = 1)
Set the input and output tensors.
Valid data layouts: All
Valid data type configurations:

| src0 | src1 | src2 | dst |
|---|---|---|---|
| F16 | F16 | F16 | F16 |
| F32 | F32 | F32 | F32 |
| BFLOAT16 | BFLOAT16 | BFLOAT16 | BFLOAT16 |
| QASYMM8 | QASYMM8 | S32 | QASYMM8 |
| QASYMM8 | QSYMM8_PER_CHANNEL | S32 | QASYMM8 |
| QASYMM8_SIGNED | QASYMM8_SIGNED | S32 | QASYMM8_SIGNED |
| QASYMM8_SIGNED | QSYMM8_PER_CHANNEL | S32 | QASYMM8_SIGNED |
Parameters:

[in] input: Source tensor. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/BFLOAT16/F16/F32.
[in] weights: Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data types supported: QASYMM8/QASYMM8_SIGNED/QSYMM8_PER_CHANNEL/BFLOAT16/F16/F32.
[in] biases: Biases tensor. Shared biases are supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: should match the input data type, except for input of QASYMM8/QASYMM8_SIGNED type, where biases should be of S32 type.
[out] output: Destination tensor. The 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: same as input.
[in] conv_info: Contains padding and stride information described in PadStrideInfo.
[in] weights_info: Specifies if the weights tensor has been reshaped with NEWeightsReshapeKernel. If this is not part of the fully connected layer, the weights tensor has also been transposed with cpu::kernels::CpuGemmTranspose1xWKernel. Data type supported: same as input.
[in] dilation: (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
[in] act_info: (Optional) Activation layer information in case of a fused activation. Only RELU, BOUNDED_RELU and LU_BOUNDED_RELU are supported.
[in] enable_fast_math: (Optional) Enable fast-math computation. If this flag is set, the function may dispatch the fastest implementation available, which may introduce a drop in accuracy. Defaults to false.
[in] num_groups: (Optional) Number of groups when performing a grouped convolution. num_groups != 1 is not supported.
Definition at line 57 of file NEGEMMConvolutionLayer.cpp.
References arm_compute::ACL_DST, arm_compute::ACL_SRC_0, arm_compute::ACL_SRC_1, arm_compute::ACL_SRC_2, arm_compute::test::validation::act_info, ARM_COMPUTE_ERROR_ON_NULLPTR, arm_compute::test::validation::conv_info, ITensor::info(), arm_compute::test::validation::input, arm_compute::test::validation::num_groups, and arm_compute::test::validation::weights_info.
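As a minimal end-to-end sketch of the configure/allocate/run flow above: the tensor shapes, strides, and padding values below are illustrative assumptions (a 32x32 F32 input with 3 channels and eight 3x3 filters), not values taken from this page, and building it requires linking against the Arm Compute Library.

```cpp
#include "arm_compute/core/TensorInfo.h"
#include "arm_compute/core/Types.h"
#include "arm_compute/runtime/NEON/functions/NEGEMMConvolutionLayer.h"
#include "arm_compute/runtime/Tensor.h"

using namespace arm_compute;

int main()
{
    // Illustrative shapes: [width, height, IFM] input, [kernel_x, kernel_y, IFM, OFM] weights.
    Tensor input, weights, biases, output;
    input.allocator()->init(TensorInfo(TensorShape(32U, 32U, 3U), 1, DataType::F32));
    weights.allocator()->init(TensorInfo(TensorShape(3U, 3U, 3U, 8U), 1, DataType::F32));
    biases.allocator()->init(TensorInfo(TensorShape(8U), 1, DataType::F32));
    output.allocator()->init(TensorInfo(TensorShape(32U, 32U, 8U), 1, DataType::F32));

    // Stride 1 and pad 1 keep the spatial size unchanged for a 3x3 kernel.
    PadStrideInfo conv_info(1, 1, 1, 1);

    NEGEMMConvolutionLayer conv;
    conv.configure(&input, &weights, &biases, &output, conv_info);

    // Backing memory is allocated after configuration, then the function is executed.
    input.allocator()->allocate();
    weights.allocator()->allocate();
    biases.allocator()->allocate();
    output.allocator()->allocate();

    conv.run(); // run() calls prepare() internally, per the cross-references on this page
    return 0;
}
```

Filling the tensors with real data (e.g. via `Tensor::buffer()` or an import mechanism) is omitted here for brevity.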
static Status has_opt_impl (arm_compute::WeightFormat &expected_weight_format, const ITensorInfo *src, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *dst, const PadStrideInfo &conv_info, const WeightsInfo &weights_info = WeightsInfo(), const Size2D &dilation = Size2D(1U, 1U), const ActivationLayerInfo &act_info = ActivationLayerInfo(), bool enable_fast_math = false)
Static function to check if there is an optimized version of GEMM available for the input parameters.
The method is intended to be used to find out the optimal memory layout to be used for the weights tensor when running variable weights execution.
The user can query the database of optimised kernels in arm_gemm by specifying one of the enumerations of arm_compute::WeightFormat in the weight_format field of the input parameter weights_info. In case of success, the method writes the expected format into the output parameter expected_weight_format. The expected_weight_format can then be used in the configure method of the class for retrieving the optimal kernel.
Use case one - query for a specific format:

```cpp
WeightInfo weights_info(..., arm_compute::WeightFormat::OHWIo4, ...); // Set the value of the input query.
if (NEGEMMConvolutionLayer::has_opt_impl(WeightFormat(), ...., weights_info, ...))
{
    auto conv = std::make_unique<NEGEMMConvolutionLayer>();
    conv->configure(..., weights_info, ...); // Uses the same WeightFormat the user wanted originally, OHWIo4.
    conv->run(...);
}
```
Use case two - query for any format that would be optimal for the GEMM to execute:

```cpp
WeightInfo weights_info(..., arm_compute::WeightFormat::ANY, ...); // Set the value of the input query.
arm_compute::WeightFormat expected_wf;
if (NEGEMMConvolutionLayer::has_opt_impl(expected_wf, ...., weights_info, ...))
{
    auto conv = std::make_unique<NEGEMMConvolutionLayer>();
    // ... code to convert the layout of the weights tensor to the layout returned by has_opt_impl
    WeightInfo new_weights_info(..., expected_wf, ...); // Set the value of the WeightFormat returned by has_opt_impl.
    conv->configure(..., new_weights_info, ...);
    conv->run(...);
}
```
Notice that a GEMM configured with a WeightFormat other than UNSPECIFIED will run GEMM with variable weights mode.
Parameters:

[out] expected_weight_format: The arm_compute::WeightFormat expected by the kernel.
[in] src: Source tensor info.
[in] weights: Weights tensor info.
[in] biases: Biases tensor info. Shared biases are supported.
[in] dst: Destination tensor info.
[in] conv_info: Contains padding and stride information described in PadStrideInfo.
[in] weights_info: (Optional) Specifies additional configuration parameters for the weights of the GEMM computation.
[in] dilation: (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
[in] act_info: (Optional) Activation layer information in case of a fused activation. Only RELU, BOUNDED_RELU and LU_BOUNDED_RELU are supported, as well as no activation (i.e. Linear), which is the default value.
[in] enable_fast_math: (Optional) Enable fast-math computation. If this flag is set, the function may dispatch the fastest implementation available, which may introduce a drop in accuracy. Defaults to false.
Definition at line 83 of file NEGEMMConvolutionLayer.cpp.
References arm_compute::test::validation::act_info, arm_compute::test::validation::conv_info, arm_compute::test::validation::dst, CpuGemmConv2d::has_opt_impl(), arm_compute::test::validation::src, and arm_compute::test::validation::weights_info.
NEGEMMConvolutionLayer & operator= (const NEGEMMConvolutionLayer &)=delete

Prevent instances of this class from being copied (as this class contains pointers).
NEGEMMConvolutionLayer & operator= (NEGEMMConvolutionLayer &&)=delete

Prevent instances of this class from being moved (as this class contains non-movable objects).
void prepare () override
Prepare the function for executing.
Any one-off pre-processing step required by the function is handled here.
Reimplemented from IFunction.
Definition at line 97 of file NEGEMMConvolutionLayer.cpp.
Referenced by NEGEMMConvolutionLayer::run().
void run () override
Run the kernels contained in the function.
For CPU kernels:

- Multi-threading is used for the kernels which are parallelisable.
- By default std::thread::hardware_concurrency() threads are used.

For OpenCL kernels:

- All the kernels are enqueued on the queue associated with CLScheduler.
- The queue is then flushed.
Implements IFunction.
Definition at line 90 of file NEGEMMConvolutionLayer.cpp.
References NEGEMMConvolutionLayer::prepare().
static Status validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, const PadStrideInfo &conv_info, const WeightsInfo &weights_info = WeightsInfo(), const Size2D &dilation = Size2D(1U, 1U), const ActivationLayerInfo &act_info = ActivationLayerInfo(), bool enable_fast_math = false, unsigned int num_groups = 1)
Static function to check if given info will lead to a valid configuration of NEGEMMConvolutionLayer.
Parameters:

[in] input: Source tensor info. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/BFLOAT16/F16/F32.
[in] weights: Weights tensor info. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data types supported: QASYMM8/QASYMM8_SIGNED/QSYMM8_PER_CHANNEL/BFLOAT16/F16/F32.
[in] biases: Biases tensor info. Shared biases are supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: should match the input data type, except for input of QASYMM8/QASYMM8_SIGNED type, where biases should be of S32 type.
[in] output: Destination tensor info. The 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: same as input.
[in] conv_info: Contains padding and stride information described in PadStrideInfo.
[in] weights_info: Specifies if the weights tensor has been reshaped with NEWeightsReshapeKernel. If this is not part of the fully connected layer, the weights tensor has also been transposed with cpu::kernels::CpuGemmTranspose1xWKernel. Data type supported: same as input.
[in] dilation: (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
[in] act_info: (Optional) Activation layer information in case of a fused activation. Only RELU, BOUNDED_RELU and LU_BOUNDED_RELU are supported.
[in] enable_fast_math: (Optional) Enable fast-math computation. If this flag is set, the function may dispatch the fastest implementation available, which may introduce a drop in accuracy. Defaults to false.
[in] num_groups: (Optional) Number of groups when performing a grouped convolution. num_groups != 1 is not supported.
Definition at line 77 of file NEGEMMConvolutionLayer.cpp.
References arm_compute::test::validation::act_info, arm_compute::test::validation::conv_info, arm_compute::test::validation::input, arm_compute::test::validation::num_groups, CpuGemmConv2d::validate(), and arm_compute::test::validation::weights_info.
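Because validate() takes ITensorInfo rather than ITensor, a configuration can be checked before any tensor memory is allocated. A sketch of that pattern follows; the shapes are illustrative assumptions (matching the hypothetical 3x3/8-filter example used earlier, not values from this page), and compiling it requires the Arm Compute Library.

```cpp
#include "arm_compute/core/TensorInfo.h"
#include "arm_compute/core/Types.h"
#include "arm_compute/runtime/NEON/functions/NEGEMMConvolutionLayer.h"

using namespace arm_compute;

bool conv_config_is_valid()
{
    // Tensor metadata only; no buffers are allocated at this point.
    const TensorInfo input(TensorShape(32U, 32U, 3U), 1, DataType::F32);
    const TensorInfo weights(TensorShape(3U, 3U, 3U, 8U), 1, DataType::F32);
    const TensorInfo biases(TensorShape(8U), 1, DataType::F32);
    const TensorInfo output(TensorShape(32U, 32U, 8U), 1, DataType::F32);
    const PadStrideInfo conv_info(1, 1, 1, 1);

    // Returns a Status describing the problem instead of asserting inside configure().
    const Status st = NEGEMMConvolutionLayer::validate(&input, &weights, &biases, &output, conv_info);
    return static_cast<bool>(st);
}
```

Calling validate() first is useful when tensor shapes or data types come from user input or a model file, so an unsupported combination can be reported gracefully before configure() is reached.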