24.02.1
Common interface for all kernels implemented in C++.
#include <ICPPKernel.h>
Public Member Functions
virtual | ~ICPPKernel ()=default |
Default destructor.
virtual void | run (const Window &window, const ThreadInfo &info) |
Execute the kernel on the passed window.
virtual void | run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator) |
Legacy compatibility layer for implementations which do not support thread_locator. In these cases the interface is simply narrowed down to the legacy version.
virtual void | run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info) |
Execute the kernel on the passed window.
virtual size_t | get_mws (const CPUInfo &platform, size_t thread_count) const |
Return minimum workload size of the relevant kernel.
virtual const char * | name () const =0 |
Name of the kernel.
Public Member Functions inherited from IKernel
IKernel () |
Constructor.
virtual | ~IKernel ()=default |
Destructor.
virtual bool | is_parallelisable () const |
Indicates whether or not the kernel is parallelisable.
virtual BorderSize | border_size () const |
The size of the border for that kernel.
const Window & | window () const |
The maximum window the kernel can be executed on.
bool | is_window_configured () const |
Function to check if the embedded window of this kernel has been configured.
Static Public Attributes
static constexpr size_t | default_mws = 1 |
Common interface for all kernels implemented in C++.
Definition at line 38 of file ICPPKernel.h.
virtual ~ICPPKernel ()=default | virtual, default |
Default destructor.
virtual size_t get_mws (const CPUInfo &platform, size_t thread_count) const | inline, virtual |
Return minimum workload size of the relevant kernel.
[in] | platform | The CPU platform used to create the context. |
[in] | thread_count | Number of threads in the execution. |
Reimplemented in CpuDivisionKernel, CpuDepthwiseConv2dAssemblyWrapperKernel, CpuGemmAssemblyWrapperKernel< TypeInput, TypeOutput >, CpuIm2ColKernel, CpuArithmeticKernel, CpuMulKernel, NEPadLayerKernel, CpuAddKernel, CpuSubKernel, CpuReshapeKernel, and CpuActivationKernel.
Definition at line 100 of file ICPPKernel.h.
References ARM_COMPUTE_UNUSED, and ICPPKernel::default_mws.
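The reference list above suggests that the default get_mws() ignores both arguments and returns default_mws. The following is a minimal sketch of that behaviour and of an override, not the library's code: CPUInfo is an empty stand-in, KernelSketch, DefaultKernelSketch and TunedKernelSketch are hypothetical names, and the threshold and the value 1024 are purely illustrative.

```cpp
#include <cassert>
#include <cstddef>

// Empty stand-in for arm_compute::CPUInfo (assumption, for illustration only).
struct CPUInfo {};

// Reduced stand-in that mirrors only the members discussed in this section.
class KernelSketch
{
public:
    static constexpr std::size_t default_mws = 1;

    virtual ~KernelSketch() = default;

    // Assumed default behaviour: ignore the arguments and return default_mws.
    virtual std::size_t get_mws(const CPUInfo &platform, std::size_t thread_count) const
    {
        (void)platform;
        (void)thread_count;
        return default_mws;
    }

    virtual const char *name() const = 0;
};

// Hypothetical kernel that keeps the default minimum workload size.
class DefaultKernelSketch : public KernelSketch
{
public:
    const char *name() const override { return "DefaultKernelSketch"; }
};

// Hypothetical kernel that asks for larger chunks when many threads are used.
class TunedKernelSketch : public KernelSketch
{
public:
    std::size_t get_mws(const CPUInfo &platform, std::size_t thread_count) const override
    {
        (void)platform;
        assert(thread_count > 0);
        if (thread_count > 4)
        {
            return 1024; // illustrative value, not taken from the library
        }
        return default_mws;
    }

    const char *name() const override { return "TunedKernelSketch"; }
};
```

A scheduler could query get_mws() before splitting the window, so that no thread receives less work than the kernel considers worthwhile.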
virtual const char * name () const =0 | pure virtual |
Name of the kernel.
Implemented in CpuComplexMulKernel, CpuGemmLowpMatrixBReductionKernel, CpuGemmLowpOffsetContributionOutputStageKernel, CpuIm2ColKernel, CpuWinogradConv2dTransformOutputKernel, CpuAddMulAddKernel, CpuMulKernel, CpuGemmMatrixMultiplyKernel, CpuGemmTranspose1xWKernel, CpuScaleKernel, CpuGemmLowpQuantizeDownInt32ToInt8ScaleByFixedPointKernel, CpuGemmLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel, CpuDepthwiseConv2dAssemblyWrapperKernel, CpuDepthwiseConv2dNativeKernel, CpuGemmLowpQuantizeDownInt32ToInt16ScaleByFixedPointKernel, CpuMaxUnpoolingLayerKernel, CpuDirectConv3dKernel, CpuGemmLowpOffsetContributionKernel, CpuWeightsReshapeKernel, CpuAddKernel, CpuGemmLowpQuantizeDownInt32ScaleKernel, CpuCastKernel, CpuCol2ImKernel, CpuGemmMatrixAdditionKernel, CpuActivationKernel, CpuDirectConv2dOutputStageKernel, CpuSubKernel, CpuDirectConv2dKernel, CpuGemmInterleave4x4Kernel, CpuPool3dKernel, CpuConvertFullyConnectedWeightsKernel, CpuElementwiseUnaryKernel, CpuGemmLowpMatrixMultiplyKernel, CpuPool2dKernel, CpuGemmLowpMatrixAReductionKernel, CpuGemmAssemblyWrapperKernel< TypeInput, TypeOutput >, CpuConcatenateDepthKernel, CpuFloorKernel, CpuSoftmaxKernel, NELogicalKernel, CpuWinogradConv2dTransformInputKernel, CpuQuantizeKernel, CpuConcatenateHeightKernel, CpuConcatenateWidthKernel, CpuConcatenateBatchKernel, CpuPermuteKernel, CpuCopyKernel, CpuReshapeKernel, NEGatherKernel, CpuConvertQuantizedSignednessKernel, CpuDequantizeKernel, CpuTransposeKernel, CpuPool2dAssemblyWrapperKernel, CpuElementwiseKernel< Derived >, NECol2ImKernel, CpuFillKernel, NETileKernel, NESelectKernel, NEFFTRadixStageKernel, NERangeKernel, NEReductionOperationKernel, NEStackLayerKernel, NEStridedSliceKernel, NEBatchNormalizationLayerKernel, NEBitwiseAndKernel, NEBitwiseNotKernel, NEBitwiseOrKernel, NEBitwiseXorKernel, NEFillBorderKernel, NEMeanStdDevNormalizationKernel, CPPNonMaximumSuppressionKernel, CPPPermuteKernel, NECropKernel, NEFFTDigitReverseKernel, NEFFTScaleKernel, 
NESpaceToBatchLayerKernel, CPPUpsampleKernel, NEBatchToSpaceLayerKernel, NEPadLayerKernel, NEQLSTMLayerNormalizationKernel, NESpaceToDepthLayerKernel, CPPBoxWithNonMaximaSuppressionLimitKernel, NEChannelShuffleLayerKernel, NEDepthToSpaceLayerKernel, NEFuseBatchNormalizationKernel, NEInstanceNormalizationLayerKernel, NENormalizationLayerKernel, NEReorgLayerKernel, NEROIAlignLayerKernel, CPPTopKVKernel, NEBoundingBoxTransformKernel, NEL2NormalizeLayerKernel, NEPriorBoxLayerKernel, NEReverseKernel, NEComputeAllAnchorsKernel, and NEROIPoolingLayerKernel.
virtual void run (const Window &window, const ThreadInfo &info) | inline, virtual |
Execute the kernel on the passed window.
[in] | window | Region on which to execute the kernel. (Must be a region of the window returned by window()) |
[in] | info | Info about executing thread and CPU. |
Reimplemented in NESpaceToBatchLayerKernel, NEFuseBatchNormalizationKernel, NEBatchNormalizationLayerKernel, NEBatchToSpaceLayerKernel, NECropKernel, NECol2ImKernel, CPPNonMaximumSuppressionKernel, NEROIAlignLayerKernel, NEPadLayerKernel, NEFillBorderKernel, NEBoundingBoxTransformKernel, NEStackLayerKernel, NEFFTRadixStageKernel, NEReductionOperationKernel, NESelectKernel, NEGatherKernel, NENormalizationLayerKernel, NEL2NormalizeLayerKernel, NEFFTDigitReverseKernel, NEMeanStdDevNormalizationKernel, NERangeKernel, CPPBoxWithNonMaximaSuppressionLimitKernel, CPPTopKVKernel, NEPriorBoxLayerKernel, NEQLSTMLayerNormalizationKernel, CPPPermuteKernel, NEDepthToSpaceLayerKernel, NEInstanceNormalizationLayerKernel, NEReorgLayerKernel, NEReverseKernel, NEFFTScaleKernel, NEComputeAllAnchorsKernel, CpuGemmAssemblyWrapperKernel< TypeInput, TypeOutput >, NESpaceToDepthLayerKernel, NEChannelShuffleLayerKernel, NETileKernel, NEROIPoolingLayerKernel, NEBitwiseAndKernel, NEBitwiseOrKernel, NEBitwiseXorKernel, CPPUpsampleKernel, and NEBitwiseNotKernel.
Definition at line 57 of file ICPPKernel.h.
References ARM_COMPUTE_ERROR, ARM_COMPUTE_UNUSED, arm_compute::test::validation::info, and IKernel::window().
Referenced by ICPPKernel::run_nd(), and SingleThreadScheduler::schedule().
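virtual void run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator) | inline, virtual |
Legacy compatibility layer for implementations which do not support thread_locator. In these cases the interface is simply narrowed down to the legacy version.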
[in] | window | Region on which to execute the kernel. (Must be a region of the window returned by window()) |
[in] | info | Info about executing thread and CPU. |
[in] | thread_locator | Specifies "where" the current thread is in the multi-dimensional space |
Reimplemented in CpuGemmAssemblyWrapperKernel< TypeInput, TypeOutput >.
Definition at line 70 of file ICPPKernel.h.
References ARM_COMPUTE_UNUSED, arm_compute::test::validation::info, ICPPKernel::run(), and IKernel::window().
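The narrowing described above can be sketched as follows, assuming the default run_nd() discards thread_locator and forwards to run(), as the reference list suggests. Window and ThreadInfo are empty or reduced stand-ins, KernelSketch and LegacyKernelSketch are hypothetical names, and the exception stands in for ARM_COMPUTE_ERROR, whose real behaviour may differ.

```cpp
#include <cassert>
#include <stdexcept>

// Stand-ins for arm_compute::Window and arm_compute::ThreadInfo (assumptions).
struct Window {};
struct ThreadInfo
{
    int thread_id = 0;
};

// Reduced stand-in that mirrors only run() and run_nd().
class KernelSketch
{
public:
    virtual ~KernelSketch() = default;

    // Assumed default: report an error, since a kernel is expected to override it.
    // The exception is a stand-in for ARM_COMPUTE_ERROR.
    virtual void run(const Window &window, const ThreadInfo &info)
    {
        (void)window;
        (void)info;
        throw std::logic_error("default run() invoked");
    }

    // Assumed default: drop thread_locator and fall back to the legacy run().
    virtual void run_nd(const Window &window, const ThreadInfo &info, const Window &thread_locator)
    {
        (void)thread_locator;
        run(window, info);
    }
};

// Hypothetical legacy kernel that only knows about run().
class LegacyKernelSketch : public KernelSketch
{
public:
    int calls = 0;

    void run(const Window &window, const ThreadInfo &info) override
    {
        (void)window;
        assert(info.thread_id >= 0);
        ++calls; // a real kernel would process the window here
    }
};
```

With this arrangement a scheduler can always call run_nd(); kernels that never learned about thread_locator still execute through their existing run() override.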
virtual void run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info) | inline, virtual |
Execute the kernel on the passed window.
[in] | tensors | A vector containing the tensors to operate on. |
[in] | window | Region on which to execute the kernel. (Must be a region of the window returned by window()) |
[in] | info | Info about executing thread and CPU. |
Reimplemented in CpuComplexMulKernel, CpuGemmLowpMatrixBReductionKernel, CpuGemmLowpOffsetContributionOutputStageKernel, CpuIm2ColKernel, NEStridedSliceKernel, CpuWinogradConv2dTransformOutputKernel, CpuAddMulAddKernel, CpuMulKernel, CpuGemmMatrixMultiplyKernel, CpuGemmTranspose1xWKernel, CpuScaleKernel, CpuGemmLowpQuantizeDownInt32ToInt8ScaleByFixedPointKernel, CpuGemmLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel, CpuDepthwiseConv2dAssemblyWrapperKernel, NEFillBorderKernel, CpuDepthwiseConv2dNativeKernel, CpuGemmLowpQuantizeDownInt32ToInt16ScaleByFixedPointKernel, CpuDirectConv3dKernel, CpuGemmLowpOffsetContributionKernel, CpuWeightsReshapeKernel, CpuAddKernel, CpuGemmLowpQuantizeDownInt32ScaleKernel, CpuPool2dAssemblyWrapperKernel, CpuCastKernel, CpuCol2ImKernel, CpuGemmMatrixAdditionKernel, CpuActivationKernel, CpuDirectConv2dOutputStageKernel, CpuMaxUnpoolingLayerKernel, CpuSubKernel, CpuDirectConv2dKernel, CpuGemmInterleave4x4Kernel, CpuPool3dKernel, CpuConvertFullyConnectedWeightsKernel, CpuElementwiseUnaryKernel, CpuGemmLowpMatrixMultiplyKernel, CpuPool2dKernel, CpuGemmLowpMatrixAReductionKernel, CpuConcatenateDepthKernel, CpuFloorKernel, CpuSoftmaxKernel, NELogicalKernel, CpuQuantizeKernel, CpuWinogradConv2dTransformInputKernel, CpuConcatenateHeightKernel, CpuConcatenateWidthKernel, CpuConcatenateBatchKernel, CpuPermuteKernel, CpuCopyKernel, CpuReshapeKernel, CpuConvertQuantizedSignednessKernel, CpuDequantizeKernel, CpuTransposeKernel, CpuElementwiseKernel< Derived >, CpuElementwiseKernel< CpuComparisonKernel >, CpuElementwiseKernel< CpuArithmeticKernel >, and CpuFillKernel.
Definition at line 88 of file ICPPKernel.h.
References ARM_COMPUTE_UNUSED, arm_compute::test::validation::info, and IKernel::window().
Referenced by SingleThreadScheduler::schedule_op(), and OMPScheduler::schedule_op().
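To illustrate the calling pattern of run_op(), where tensors are supplied at execution time instead of being stored in the kernel, here is a minimal sketch. Nothing in it is the library's API: TensorPackSketch is a hypothetical stand-in for ITensorPack, Window is reduced to a one-dimensional half-open range, and the slot identifiers are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// One-dimensional stand-in for arm_compute::Window: the half-open range [start, end).
struct Window
{
    std::size_t start;
    std::size_t end;
};

// Stand-in for arm_compute::ThreadInfo (assumption).
struct ThreadInfo
{
    int thread_id;
};

// Hypothetical stand-in for ITensorPack: slot id -> tensor storage.
using TensorPackSketch = std::map<int, std::vector<float> *>;

// Invented slot identifiers for this example.
enum SlotSketch
{
    SRC_0 = 0,
    SRC_1 = 1,
    DST   = 2
};

// Hypothetical element-wise addition kernel in the run_op() style.
class AddKernelSketch
{
public:
    void run_op(TensorPackSketch &tensors, const Window &window, const ThreadInfo &info)
    {
        (void)info;
        const std::vector<float> &a   = *tensors.at(SRC_0);
        const std::vector<float> &b   = *tensors.at(SRC_1);
        std::vector<float>       &out = *tensors.at(DST);
        assert(window.start <= window.end && window.end <= out.size());

        // Touch only the elements inside the window passed by the caller.
        for (std::size_t i = window.start; i < window.end; ++i)
        {
            out[i] = a[i] + b[i];
        }
    }

    const char *name() const { return "AddKernelSketch"; }
};
```

Because the kernel holds no tensor pointers, the same instance can be reused with different packs, and a scheduler can hand disjoint windows to different threads.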
constexpr size_t default_mws = 1 | static, constexpr |
Definition at line 41 of file ICPPKernel.h.
Referenced by CpuActivationKernel::get_mws(), CpuReshapeKernel::get_mws(), CpuSubKernel::get_mws(), CpuAddKernel::get_mws(), NEPadLayerKernel::get_mws(), ICPPKernel::get_mws(), CpuMulKernel::get_mws(), CpuArithmeticKernel::get_mws(), CpuIm2ColKernel::get_mws(), CpuGemmAssemblyWrapperKernel< TypeInput, TypeOutput >::get_mws(), CpuDepthwiseConv2dAssemblyWrapperKernel::get_mws(), and CpuDivisionKernel::get_mws().