21.08
This class is a wrapper for the assembly kernels.
#include <CpuPool2dAssemblyWrapperKernel.h>
Public Member Functions
CpuPool2dAssemblyWrapperKernel ()=default
Constructor.
ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE (CpuPool2dAssemblyWrapperKernel)
const char * | name () const override
Name of the kernel.
void | configure (const ITensorInfo *src, ITensorInfo *dst, const PoolingLayerInfo &info, const CPUInfo &cpu_info)
Initialise the kernel's src and dst.
void | run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info) override
Execute the kernel on the passed window.
size_t | get_working_size (unsigned int num_threads) const
Get size of the workspace needed by the assembly kernel.
bool | is_configured () const
Was the asm kernel successfully configured?
Public Member Functions inherited from ICPPKernel
virtual | ~ICPPKernel ()=default
Default destructor.
virtual void | run (const Window &window, const ThreadInfo &info)
Execute the kernel on the passed window.
virtual void | run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator)
Legacy compatibility layer for implementations which do not support thread_locator. In these cases we simply narrow the interface down to the legacy version.
Public Member Functions inherited from IKernel
IKernel ()
Constructor.
virtual | ~IKernel ()=default
Destructor.
virtual bool | is_parallelisable () const
Indicates whether or not the kernel is parallelisable.
virtual BorderSize | border_size () const
The size of the border for that kernel.
const Window & | window () const
The maximum window the kernel can be executed on.
bool | is_window_configured () const
Function to check if the embedded window of this kernel has been configured.
Static Public Member Functions
static Status | validate (const ITensorInfo *src, const ITensorInfo *dst, const PoolingLayerInfo &info)
Static function to check if given info will lead to a valid configuration.
This class is a wrapper for the assembly kernels.
Some kernels are written in assembly and highly optimised for specific CPUs such as the Cortex-A53 or Cortex-A55. The Arm Compute Library creates an instance of CpuPool2dAssemblyWrapperKernel, together with other auxiliary data structures, to execute a single assembly kernel in the context of an NEFunction.
Definition at line 48 of file CpuPool2dAssemblyWrapperKernel.h.
CpuPool2dAssemblyWrapperKernel | ( | ) | default |
Constructor.
ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE | ( | CpuPool2dAssemblyWrapperKernel | ) |
void configure | ( | const ITensorInfo * | src, |
ITensorInfo * | dst, | ||
const PoolingLayerInfo & | info, | ||
const CPUInfo & | cpu_info | ||
) |
Initialise the kernel's src and dst.
[in] | src | Source tensor info. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32. |
[out] | dst | Destination tensor info to store the result of pooling. Data types supported: same as src. |
[in] | info | Pooling meta-data. |
[in] | cpu_info | CPU information needed to select the most appropriate kernel. |
Definition at line 44 of file CpuPool2dAssemblyWrapperKernel.cpp.
References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_UNUSED, arm_compute::auto_init_if_empty(), arm_compute::calculate_max_window(), ICloneable< T >::clone(), arm_compute::misc::shape_calculator::compute_pool_shape(), ITensorInfo::data_type(), arm_compute::test::validation::dst, arm_compute::F16, arm_compute::F32, arm_compute::test::validation::info, arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, ITensorInfo::quantization_info(), and arm_compute::test::validation::src.
Referenced by CpuPool2dAssemblyWrapperKernel::name().
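The data-type constraints listed for configure() (QASYMM8/QASYMM8_SIGNED/F16/F32 for src, with dst matching src) and the is_configured() query suggest a validate-then-configure pattern. The following is a minimal, self-contained sketch of that pattern only. `ToyDataType`, `ToyAsmKernel` and `ToyPoolWrapper` are hypothetical stand-ins invented for illustration; they are not Arm Compute Library types, and the real kernel performs additional checks that are not modelled here.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins; none of these types belong to Arm Compute Library.
enum class ToyDataType { F32, F16, QASYMM8, QASYMM8_SIGNED, S32 };

struct ToyAsmKernel
{
    std::string name;
};

class ToyPoolWrapper
{
public:
    // Pure check: reports whether a configuration is acceptable without changing state.
    static bool validate(ToyDataType src, ToyDataType dst)
    {
        const bool supported = src == ToyDataType::F32 || src == ToyDataType::F16 ||
                               src == ToyDataType::QASYMM8 || src == ToyDataType::QASYMM8_SIGNED;
        return supported && src == dst; // dst must use the same data type as src
    }

    // Selects a kernel only when the configuration is valid; otherwise stays unconfigured.
    void configure(ToyDataType src, ToyDataType dst)
    {
        _kernel.reset();
        if(validate(src, dst))
        {
            _kernel = std::make_unique<ToyAsmKernel>(ToyAsmKernel{ "toy_asm_pool" });
            assert(_kernel != nullptr);
        }
    }

    // Lets the caller decide whether to fall back to another implementation.
    bool is_configured() const
    {
        return _kernel != nullptr;
    }

private:
    std::unique_ptr<ToyAsmKernel> _kernel;
};
```

In this sketch a caller would call `configure()` and then test `is_configured()` before scheduling the kernel, choosing a different implementation when no kernel was selected.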
size_t get_working_size | ( | unsigned int | num_threads | ) | const |
Get size of the workspace needed by the assembly kernel.
[in] | num_threads | Maximum number of threads that are going to be spawned. |
Definition at line 176 of file CpuPool2dAssemblyWrapperKernel.cpp.
Referenced by CpuPool2dAssemblyWrapperKernel::name().
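To illustrate why the workspace size depends on num_threads, here is a small sketch that assumes each thread needs a fixed amount of private scratch memory. That assumption, the `ToyWorkspaceLayout` type and its `bytes_per_thread` field are all hypothetical; the actual size is determined by the selected assembly kernel and may follow a different rule.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout: every thread owns one contiguous slice of a shared buffer.
struct ToyWorkspaceLayout
{
    std::size_t bytes_per_thread; // assumed fixed per-thread scratch requirement

    // Total bytes the caller must allocate for the given maximum thread count.
    std::size_t get_working_size(unsigned int num_threads) const
    {
        return bytes_per_thread * num_threads;
    }

    // Start of the slice reserved for one thread inside the shared buffer.
    std::uint8_t *slice_for(std::uint8_t *base, unsigned int thread_id, unsigned int num_threads) const
    {
        assert(thread_id < num_threads);
        return base + bytes_per_thread * thread_id;
    }
};
```

Under this assumption the caller allocates the buffer once, sized for the maximum number of threads, and each thread indexes into it by its own ID so that no two threads share scratch memory.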
bool is_configured | ( | ) | const |
Was the asm kernel successfully configured?
Definition at line 181 of file CpuPool2dAssemblyWrapperKernel.cpp.
References GemmTuner::args, arm_compute::AVG, arm_compute::quantization::calculate_quantized_multiplier(), ITensorInfo::dimension(), arm_compute::test::validation::dst, PoolingLayerInfo::exclude_padding, arm_compute::test::validation::idx_height, arm_compute::test::validation::idx_width, arm_compute::test::validation::info, MAX, PadStrideInfo::pad_bottom(), PadStrideInfo::pad_left(), PadStrideInfo::pad_right(), PoolingLayerInfo::pad_stride_info, PadStrideInfo::pad_top(), PoolingLayerInfo::pool_size, PoolingLayerInfo::pool_type, ITensorInfo::quantization_info(), UniformQuantizationInfo::scale, arm_compute::test::validation::src, PadStrideInfo::stride(), QuantizationInfo::uniform(), Size2D::x(), and Size2D::y().
Referenced by CpuPool2dAssemblyWrapperKernel::name().
const char * name | ( | ) | const |
inline override virtual |
Name of the kernel.
Implements ICPPKernel.
Definition at line 56 of file CpuPool2dAssemblyWrapperKernel.h.
References CpuPool2dAssemblyWrapperKernel::configure(), arm_compute::test::validation::dst, CpuPool2dAssemblyWrapperKernel::get_working_size(), arm_compute::test::validation::info, CpuPool2dAssemblyWrapperKernel::is_configured(), CpuPool2dAssemblyWrapperKernel::run_op(), arm_compute::test::validation::src, CpuPool2dAssemblyWrapperKernel::validate(), and IKernel::window().
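void run_op | ( | ITensorPack & | tensors, |
const Window & | window, | ||
const ThreadInfo & | info | ||
) |
override virtual |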
Execute the kernel on the passed window.
[in] | tensors | A vector containing the tensors to operate on. |
[in] | window | Region on which to execute the kernel. (Must be a region of the window returned by window()) |
[in] | info | Info about executing thread and CPU. |
Reimplemented from ICPPKernel.
Definition at line 142 of file CpuPool2dAssemblyWrapperKernel.cpp.
References arm_compute::ACL_DST, arm_compute::ACL_INT_0, arm_compute::ACL_SRC, ARM_COMPUTE_ERROR_ON, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, ITensor::buffer(), arm_compute::test::validation::dst, arm_compute::test::validation::dst_shape, ITensorPack::empty(), ITensorPack::get_const_tensor(), ITensorPack::get_tensor(), ITensor::info(), BorderSize::left, ThreadInfo::num_threads, ITensorInfo::offset_first_element_in_bytes(), ITensorInfo::padding(), arm_compute::test::validation::src, ITensorInfo::tensor_shape(), and ThreadInfo::thread_id.
Referenced by CpuPool2dAssemblyWrapperKernel::name().
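The run_op() parameters describe a pattern in which tensors are looked up from a pack and only the given sub-window is processed, so that several threads can each handle a different region. The sketch below shows that idea on a 1-D max pooling with pool size 2 and stride 2. `ToySlot`, `ToyPack`, `ToyWindow` and `toy_run_op` are hypothetical names for illustration; the real kernel reads ITensor objects from an ITensorPack and dispatches to an assembly routine instead of the loop shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical slot IDs and pack type, loosely mirroring a source/destination lookup.
enum class ToySlot { Src, Dst };
using ToyPack = std::map<ToySlot, std::vector<float> *>;

// Half-open range [start, end) of output elements to compute.
struct ToyWindow
{
    std::size_t start;
    std::size_t end;
};

// 1-D max pooling (pool size 2, stride 2) restricted to the given output window.
void toy_run_op(ToyPack &tensors, const ToyWindow &window)
{
    assert(!tensors.empty());
    const std::vector<float> &src = *tensors.at(ToySlot::Src);
    std::vector<float>       &dst = *tensors.at(ToySlot::Dst);
    assert(window.start <= window.end);
    assert(window.end <= dst.size() && 2 * dst.size() <= src.size());

    for(std::size_t i = window.start; i < window.end; ++i)
    {
        dst[i] = std::max(src[2 * i], src[2 * i + 1]);
    }
}
```

Calling `toy_run_op` once per thread with disjoint windows fills the whole destination, because each call writes only the output elements inside its own window.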
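static Status validate | ( | const ITensorInfo * | src, |
const ITensorInfo * | dst, | ||
const PoolingLayerInfo & | info | ||
) |
static |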
Static function to check if given info will lead to a valid configuration.
Takes the same src, dst and info arguments as CpuPool2dAssemblyWrapperKernel::configure(), without cpu_info, and returns a Status.
Definition at line 94 of file CpuPool2dAssemblyWrapperKernel.cpp.
References ARM_COMPUTE_RETURN_ERROR_MSG, ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_CPU_F16_UNSUPPORTED, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_MSG, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, arm_compute::AVG, arm_compute::quantization::calculate_quantized_multiplier(), ITensorInfo::data_layout(), PoolingLayerInfo::data_layout, ITensorInfo::data_type(), PoolingLayerInfo::exclude_padding, arm_compute::F16, arm_compute::F32, PadStrideInfo::has_padding(), arm_compute::MAX, arm_compute::NHWC, PoolingLayerInfo::pad_stride_info, PoolingLayerInfo::pool_type, arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, ITensorInfo::quantization_info(), UniformQuantizationInfo::scale, ITensorInfo::total_size(), and QuantizationInfo::uniform().
Referenced by CpuPool2d::configure(), CpuPool2dAssemblyWrapperKernel::name(), and CpuPool2d::validate().