23.08
This class is a wrapper for the assembly kernels.
#include <CpuGemmAssemblyWrapperKernel.h>
Public Member Functions
CpuGemmAssemblyWrapperKernel ()
Constructor.
CpuGemmAssemblyWrapperKernel (CpuGemmAssemblyWrapperKernel &)=delete
CpuGemmAssemblyWrapperKernel (CpuGemmAssemblyWrapperKernel &&)=default
CpuGemmAssemblyWrapperKernel & | operator= (CpuGemmAssemblyWrapperKernel &)=delete
const char * | name () const override
Name of the kernel.
void | run (const Window &window, const ThreadInfo &info) override
Execute the kernel on the passed window.
void | run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator) override
Legacy compatibility layer for implementations that do not support thread_locator. In these cases the interface is simply narrowed down to the legacy version.
void | configure (arm_gemm::GemmCommon< TypeInput, TypeOutput > *kernel, std::string kernel_name_tag)
Initialise the kernel's input and output.
size_t | get_mws (const CPUInfo &platform, size_t thread_count) const override
Return minimum workload size of the relevant kernel.
Public Member Functions inherited from ICPPKernel
virtual | ~ICPPKernel ()=default
Default destructor.
virtual void | run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info)
Execute the kernel on the passed window.
Public Member Functions inherited from IKernel
IKernel ()
Constructor.
virtual | ~IKernel ()=default
Destructor.
virtual bool | is_parallelisable () const
Indicates whether or not the kernel is parallelisable.
virtual BorderSize | border_size () const
The size of the border for that kernel.
const Window & | window () const
The maximum window the kernel can be executed on.
bool | is_window_configured () const
Function to check if the embedded window of this kernel has been configured.
Additional Inherited Members
Static Public Attributes inherited from ICPPKernel
static constexpr size_t | default_mws = 1
This class is a wrapper for the assembly kernels.
Some kernels are written in assembly and highly optimised for specific CPUs such as the A53 or A55. This class works as a wrapper for these assembly kernels. The Arm Compute Library creates an instance of CpuGemmAssemblyWrapperKernel, together with other auxiliary data structures, to execute a single assembly kernel in the context of an NEFunction.
The wrapped object is the actual kernel implemented in assembly. It is of type template<typename To, typename Tr> class GemmCommon and is passed to configure() as arm_gemm::GemmCommon< TypeInput, TypeOutput > *.
Definition at line 55 of file CpuGemmAssemblyWrapperKernel.h.
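The wrapping idea described above can be illustrated with a minimal, self-contained sketch. It does not use the library: `MockGemmCommon`, `CountingGemm` and `MockWrapperKernel` are hypothetical stand-ins, the window is assumed to be a one-dimensional range, and `configure()` is reduced to a single argument for brevity.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for arm_gemm::GemmCommon<To, Tr>; NOT the library's real interface.
template <typename To, typename Tr>
class MockGemmCommon
{
public:
    virtual ~MockGemmCommon() = default;
    // Number of work items the kernel exposes (assumed 1-D for this sketch).
    virtual std::size_t get_window_size() const = 0;
    // Process the work items in [start, end) on the given thread.
    virtual void execute(std::size_t start, std::size_t end, int thread_id) = 0;
};

// Hypothetical "assembly" kernel that only counts the items it was asked to process.
class CountingGemm : public MockGemmCommon<float, float>
{
public:
    std::size_t get_window_size() const override { return 8; }
    void execute(std::size_t start, std::size_t end, int /*thread_id*/) override
    {
        processed += end - start;
    }
    std::size_t processed = 0;
};

// Simplified wrapper: keeps a non-owning pointer and forwards each run() call.
template <typename TypeInput, typename TypeOutput>
class MockWrapperKernel
{
public:
    void configure(MockGemmCommon<TypeInput, TypeOutput> *kernel)
    {
        assert(kernel != nullptr); // plays the role of a null-pointer check
        _kernel      = kernel;
        _window_size = kernel->get_window_size();
    }
    std::size_t window_size() const { return _window_size; }
    void run(std::size_t start, std::size_t end, int thread_id)
    {
        assert(_kernel != nullptr);  // the wrapper must be configured first
        assert(end <= _window_size); // only sub-ranges of the configured window
        _kernel->execute(start, end, thread_id);
    }

private:
    MockGemmCommon<TypeInput, TypeOutput> *_kernel = nullptr;
    std::size_t                            _window_size = 0;
};
```

In this sketch a scheduler would split the configured window into sub-ranges and call `run()` once per range, possibly from different threads; the wrapper itself holds no GEMM logic.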
CpuGemmAssemblyWrapperKernel () | inline
Constructor.
Definition at line 60 of file CpuGemmAssemblyWrapperKernel.h.
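CpuGemmAssemblyWrapperKernel (CpuGemmAssemblyWrapperKernel &) | delete
CpuGemmAssemblyWrapperKernel (CpuGemmAssemblyWrapperKernel &&) | default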
void configure (arm_gemm::GemmCommon< TypeInput, TypeOutput > *kernel, std::string kernel_name_tag) | inline
Initialise the kernel's input and output.
[in] | kernel | Pointer to an assembly kernel implementation. |
[in] | kernel_name_tag | Tag to be attached to the kernel's name. |
Definition at line 104 of file CpuGemmAssemblyWrapperKernel.h.
References ARM_COMPUTE_ERROR_ON_NULLPTR, IGemmCommon::get_window_size(), and arm_gemm::to_window().
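As a rough illustration of what "attaching a tag to the kernel's name" could look like, the sketch below appends an optional tag to a base name. The helper `compose_kernel_name` is hypothetical and the `/` separator is an assumption made for this example, not something this page states.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: append an optional tag to a base kernel name.
// The "/" separator is an assumption made for this sketch.
inline std::string compose_kernel_name(const std::string &base, const std::string &kernel_name_tag)
{
    if (kernel_name_tag.empty())
    {
        return base; // no tag supplied: keep the plain name
    }
    return base + "/" + kernel_name_tag;
}
```

A tagged name of this kind makes it possible to tell apart several wrapper instances that drive different assembly kernels, for example in profiling output.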
size_t get_mws (const CPUInfo &platform, size_t thread_count) const | inline, override, virtual
Return minimum workload size of the relevant kernel.
[in] | platform | The CPU platform used to create the context. |
[in] | thread_count | Number of threads in the execution. |
Reimplemented from ICPPKernel.
Definition at line 125 of file CpuGemmAssemblyWrapperKernel.h.
References ARM_COMPUTE_UNUSED, and ICPPKernel::default_mws.
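The references above (`ARM_COMPUTE_UNUSED` and `ICPPKernel::default_mws`) suggest that this override ignores both arguments and returns the default value. A minimal sketch of that pattern, using hypothetical mock types rather than the library's classes, might look like this:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for CPUInfo; it carries no data in this sketch.
struct MockCPUInfo
{
};

// Hypothetical base class exposing a default minimum workload size.
class MockCppKernel
{
public:
    static constexpr std::size_t default_mws = 1;
    virtual ~MockCppKernel() = default;
    virtual std::size_t get_mws(const MockCPUInfo &platform, std::size_t thread_count) const = 0;
};

// Wrapper-style override: both arguments are deliberately unused.
class MockWrapper : public MockCppKernel
{
public:
    std::size_t get_mws(const MockCPUInfo &platform, std::size_t thread_count) const override
    {
        (void)platform;     // mirrors the intent of an "unused" macro
        (void)thread_count;
        return default_mws; // always the base-class default
    }
};
```

Under this assumption the result does not depend on the platform or the thread count, so callers always receive the same minimum workload size.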
const char * name () const | inline, override, virtual
Name of the kernel.
Implements ICPPKernel.
Definition at line 69 of file CpuGemmAssemblyWrapperKernel.h.
CpuGemmAssemblyWrapperKernel & operator= (CpuGemmAssemblyWrapperKernel &) | delete
void run (const Window &window, const ThreadInfo &info) | inline, override, virtual
Execute the kernel on the passed window.
[in] | window | Region on which to execute the kernel. (Must be a region of the window returned by window()) |
[in] | info | Info about executing thread and CPU. |
Reimplemented from ICPPKernel.
Definition at line 74 of file CpuGemmAssemblyWrapperKernel.h.
References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, IGemmCommon::execute(), arm_compute::test::validation::info, arm_gemm::to_ndcoord(), and IKernel::window().
void run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator) | inline, override, virtual
Legacy compatibility layer for implementations that do not support thread_locator. In these cases the interface is simply narrowed down to the legacy version.
[in] | window | Region on which to execute the kernel. (Must be a region of the window returned by window()) |
[in] | info | Info about executing thread and CPU. |
[in] | thread_locator | Specifies "where" the current thread is in the multi-dimensional space |
Reimplemented from ICPPKernel.
Definition at line 87 of file CpuGemmAssemblyWrapperKernel.h.
References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, IGemmCommon::execute(), arm_compute::test::validation::info, arm_gemm::to_ndcoord(), and IKernel::window().
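The "narrowing" mentioned in the description of run_nd() can be sketched as a base-class default that drops `thread_locator` and forwards to the two-argument `run()`. All types below (`MockWindow`, `MockThreadInfo`, `MockKernelBase`, `LegacyKernel`) are hypothetical and only illustrate the pattern; they are not the library's classes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 1-D window: the half-open range [start, end).
struct MockWindow
{
    std::size_t start;
    std::size_t end;
};

// Hypothetical per-thread information.
struct MockThreadInfo
{
    int thread_id;
};

class MockKernelBase
{
public:
    virtual ~MockKernelBase() = default;

    // Legacy entry point that every kernel implements.
    virtual void run(const MockWindow &window, const MockThreadInfo &info) = 0;

    // Default N-D entry point: ignore thread_locator and narrow to run().
    virtual void run_nd(const MockWindow &window, const MockThreadInfo &info, const MockWindow &thread_locator)
    {
        (void)thread_locator; // legacy kernels have no use for it
        run(window, info);
    }
};

// A kernel that only knows the legacy interface.
class LegacyKernel : public MockKernelBase
{
public:
    void run(const MockWindow &window, const MockThreadInfo &info) override
    {
        items_processed += window.end - window.start;
        last_thread_id = info.thread_id;
    }
    std::size_t items_processed = 0;
    int         last_thread_id  = -1;
};
```

A kernel that does understand `thread_locator` would override `run_nd()` itself and pass the locator on to its implementation instead of discarding it.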