Compute Library
 23.08
CpuGemmAssemblyWrapperKernel< TypeInput, TypeOutput > Class Template Reference final

This class is a wrapper for the assembly kernels.

#include <CpuGemmAssemblyWrapperKernel.h>


Public Member Functions

 CpuGemmAssemblyWrapperKernel ()
 Constructor.
 
 CpuGemmAssemblyWrapperKernel (CpuGemmAssemblyWrapperKernel &)=delete
 
 CpuGemmAssemblyWrapperKernel (CpuGemmAssemblyWrapperKernel &&)=default
 
CpuGemmAssemblyWrapperKernel & operator= (CpuGemmAssemblyWrapperKernel &)=delete
 
const char * name () const override
 Name of the kernel.
 
void run (const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window.
 
void run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator) override
 Legacy compatibility layer for implementations which do not support thread_locator. In these cases we simply narrow the interface down to the legacy version.
 
void configure (arm_gemm::GemmCommon< TypeInput, TypeOutput > *kernel, std::string kernel_name_tag)
 Initialise the kernel's input and output.
 
size_t get_mws (const CPUInfo &platform, size_t thread_count) const override
 Return minimum workload size of the relevant kernel.
 
- Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor.
 
virtual void run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info)
 Execute the kernel on the passed window.
 
- Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.
 
virtual ~IKernel ()=default
 Destructor.
 
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.
 
virtual BorderSize border_size () const
 The size of the border for that kernel.
 
const Window & window () const
 The maximum window the kernel can be executed on.
 
bool is_window_configured () const
 Function to check if the embedded window of this kernel has been configured.
 

Additional Inherited Members

- Static Public Attributes inherited from ICPPKernel
static constexpr size_t default_mws = 1
 

Detailed Description

template<typename TypeInput, typename TypeOutput>
class arm_compute::cpu::kernel::CpuGemmAssemblyWrapperKernel< TypeInput, TypeOutput >

This class is a wrapper for the assembly kernels.

Some kernels are written in assembly and highly optimised for specific CPUs such as the Cortex-A53 or Cortex-A55. This class wraps these assembly kernels. The Arm Compute Library creates an instance of CpuGemmAssemblyWrapperKernel, along with other auxiliary data structures, to execute a single assembly kernel in the context of an NEFunction.

The wrapped kernel, implemented in assembly, is of type arm_gemm::GemmCommon< TypeInput, TypeOutput >, which is declared as template<typename To, typename Tr> class GemmCommon.

Definition at line 55 of file CpuGemmAssemblyWrapperKernel.h.
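
The wrapper pattern described above can be illustrated with a minimal, self-contained sketch. It does not use the Compute Library: MockGemmCommon, RecordingGemm, MockWrapperKernel and run_in_two_slices are hypothetical names invented for this illustration, and the one-dimensional [start, end) range is a simplifying assumption standing in for the library's Window type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for arm_gemm::GemmCommon<To, Tr>; NOT the real interface.
template <typename To, typename Tr>
class MockGemmCommon
{
public:
    virtual ~MockGemmCommon() = default;
    // Number of independent work items the kernel exposes.
    virtual std::size_t get_window_size() const = 0;
    // Process work items in [start, end) on the given thread.
    virtual void execute(std::size_t start, std::size_t end, int thread_id) = 0;
};

// Hypothetical kernel that only records which work items were executed.
class RecordingGemm final : public MockGemmCommon<float, float>
{
public:
    explicit RecordingGemm(std::size_t rows) : _done(rows, false) {}
    std::size_t get_window_size() const override { return _done.size(); }
    void execute(std::size_t start, std::size_t end, int /*thread_id*/) override
    {
        for(std::size_t i = start; i < end; ++i)
        {
            _done[i] = true;
        }
    }
    bool all_done() const
    {
        for(bool d : _done)
        {
            if(!d)
            {
                return false;
            }
        }
        return true;
    }

private:
    std::vector<bool> _done;
};

// Simplified wrapper: configure() stores the kernel and its window,
// run() forwards a sub-range of that window to the kernel.
template <typename TypeInput, typename TypeOutput>
class MockWrapperKernel
{
public:
    void configure(MockGemmCommon<TypeInput, TypeOutput> *kernel)
    {
        assert(kernel != nullptr);
        _kernel     = kernel;
        _window_end = kernel->get_window_size();
    }
    std::size_t window_end() const { return _window_end; }
    void run(std::size_t start, std::size_t end, int thread_id)
    {
        assert(_kernel != nullptr);
        assert(end <= _window_end);
        _kernel->execute(start, end, thread_id);
    }

private:
    MockGemmCommon<TypeInput, TypeOutput> *_kernel{ nullptr };
    std::size_t                            _window_end{ 0 };
};

// Splits the window into two slices, as a two-thread scheduler might,
// and reports whether every work item was processed.
inline bool run_in_two_slices(std::size_t rows)
{
    RecordingGemm                   gemm(rows);
    MockWrapperKernel<float, float> wrapper;
    wrapper.configure(&gemm);
    const std::size_t mid = wrapper.window_end() / 2;
    wrapper.run(0, mid, 0);
    wrapper.run(mid, wrapper.window_end(), 1);
    return gemm.all_done();
}
```

Calling run_in_two_slices(8) configures the mock wrapper once and then runs it on two disjoint slices of its window. This mirrors how a scheduler would be expected to hand each thread a sub-window of the window returned by window().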

Constructor & Destructor Documentation

◆ CpuGemmAssemblyWrapperKernel() [1/3]

Constructor.

Definition at line 60 of file CpuGemmAssemblyWrapperKernel.h.

CpuGemmAssemblyWrapperKernel()
    : _kernel(nullptr), _name("CpuGemmAssemblyWrapperKernel")
{
}

◆ CpuGemmAssemblyWrapperKernel() [2/3]

CpuGemmAssemblyWrapperKernel ( CpuGemmAssemblyWrapperKernel< TypeInput, TypeOutput > & ) = delete

◆ CpuGemmAssemblyWrapperKernel() [3/3]

CpuGemmAssemblyWrapperKernel ( CpuGemmAssemblyWrapperKernel< TypeInput, TypeOutput > && ) = default

Member Function Documentation

◆ configure()

void configure ( arm_gemm::GemmCommon< TypeInput, TypeOutput > * kernel, std::string kernel_name_tag )
inline

Initialise the kernel's input and output.

Parameters
[in]  kernel           Pointer to an assembly kernel implementation.
[in]  kernel_name_tag  Tag to be attached to the kernel's name.

Definition at line 104 of file CpuGemmAssemblyWrapperKernel.h.

void configure(arm_gemm::GemmCommon<TypeInput, TypeOutput> *kernel, std::string kernel_name_tag)
{
    ARM_COMPUTE_ERROR_ON_NULLPTR((reinterpret_cast<void *>(kernel)));
    _kernel = kernel;

    Window win = to_window(kernel->get_window_size());

    INEKernel::configure(win);

    if(!kernel_name_tag.empty())
    {
        _name += "/" + kernel_name_tag;
    }
}

References ARM_COMPUTE_ERROR_ON_NULLPTR, IGemmCommon::get_window_size(), and arm_gemm::to_window().
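
As a rough illustration of the name-tagging step in configure(), the sketch below reproduces only the string handling from the listing above. The helper names tagged_kernel_name and check_tagging, and the tag value "a64_sgemm", are hypothetical and chosen only for this example. The base name is the one set by the constructor.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: mirrors only the `_name += "/" + kernel_name_tag` step.
inline std::string tagged_kernel_name(const std::string &base, const std::string &kernel_name_tag)
{
    std::string name = base;
    if(!kernel_name_tag.empty())
    {
        // A non-empty tag is appended after a '/' separator.
        name += "/" + kernel_name_tag;
    }
    return name;
}

// Exercises both branches: with a (made-up) tag and with an empty tag.
inline void check_tagging()
{
    const std::string base = "CpuGemmAssemblyWrapperKernel";
    assert(tagged_kernel_name(base, "a64_sgemm") == "CpuGemmAssemblyWrapperKernel/a64_sgemm");
    assert(tagged_kernel_name(base, "") == "CpuGemmAssemblyWrapperKernel");
}
```

An empty tag leaves the name untouched, so name() would keep returning the default string set by the constructor.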

◆ get_mws()

size_t get_mws ( const CPUInfo & platform, size_t thread_count ) const
inline override virtual

Return minimum workload size of the relevant kernel.

Parameters
[in]  platform      The CPU platform used to create the context.
[in]  thread_count  Number of threads in the execution.
Returns
[out] small_network_mws Minimum workload size for requested configuration.

Reimplemented from ICPPKernel.

Definition at line 125 of file CpuGemmAssemblyWrapperKernel.h.

size_t get_mws(const CPUInfo &platform, size_t thread_count) const override
{
    ARM_COMPUTE_UNUSED(thread_count);
    ARM_COMPUTE_UNUSED(platform);

    return ICPPKernel::default_mws;
}

References ARM_COMPUTE_UNUSED, and ICPPKernel::default_mws.

◆ name()

const char* name ( ) const
inline override virtual

Name of the kernel.

Returns
Kernel name

Implements ICPPKernel.

Definition at line 69 of file CpuGemmAssemblyWrapperKernel.h.

const char *name() const override
{
    return _name.c_str();
}

◆ operator=()

CpuGemmAssemblyWrapperKernel& operator= ( CpuGemmAssemblyWrapperKernel< TypeInput, TypeOutput > & ) = delete

◆ run()

void run ( const Window & window, const ThreadInfo & info )
inline override virtual

Execute the kernel on the passed window.

Warning
If is_parallelisable() returns false then the passed window must be equal to window()
Note
The window has to be a region within the window returned by the window() method
The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
[in]  window  Region on which to execute the kernel. (Must be a region of the window returned by window())
[in]  info    Info about executing thread and CPU.

Reimplemented from ICPPKernel.

Definition at line 74 of file CpuGemmAssemblyWrapperKernel.h.

void run(const Window &window, const ThreadInfo &info) override
{
    ARM_COMPUTE_ERROR_ON_NULLPTR((reinterpret_cast<void *>(_kernel)));
    ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);

    auto win = arm_gemm::to_ndcoord(window);

    arm_gemm::ndcoord_t thread_locator{};

    _kernel->execute(win, thread_locator, info.thread_id);
}

References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, IGemmCommon::execute(), arm_compute::test::validation::info, arm_gemm::to_ndcoord(), and IKernel::window().

◆ run_nd()

void run_nd ( const Window & window, const ThreadInfo & info, const Window & thread_locator )
inline override virtual

Legacy compatibility layer for implementations which do not support thread_locator. In these cases we simply narrow the interface down to the legacy version.

Parameters
[in]  window          Region on which to execute the kernel. (Must be a region of the window returned by window())
[in]  info            Info about executing thread and CPU.
[in]  thread_locator  Specifies "where" the current thread is in the multi-dimensional space.

Reimplemented from ICPPKernel.

Definition at line 87 of file CpuGemmAssemblyWrapperKernel.h.

void run_nd(const Window &window, const ThreadInfo &info, const Window &thread_locator) override
{
    ARM_COMPUTE_ERROR_ON_NULLPTR((reinterpret_cast<void *>(_kernel)));
    ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);

    // Convert between arm_compute and arm_gemm types
    auto ndc_win = arm_gemm::to_ndcoord(window);
    auto ndc_tlc = arm_gemm::to_ndcoord(thread_locator);

    _kernel->execute(ndc_win, ndc_tlc, info.thread_id);
}

References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, IGemmCommon::execute(), arm_compute::test::validation::info, arm_gemm::to_ndcoord(), and IKernel::window().
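
Judging from the two listings, run() and run_nd() differ only in the thread locator handed to the wrapped kernel: run() passes a value-initialised one, while run_nd() forwards the caller's. The self-contained sketch below illustrates that difference with mock types. Coord, LocatorRecorder, MockWrapper and the two helper functions are hypothetical, and the two-element array is a simplifying assumption rather than the real arm_gemm::ndcoord_t.

```cpp
#include <array>
#include <cassert>

// Hypothetical, simplified coordinate type; the real arm_gemm::ndcoord_t is richer.
using Coord = std::array<unsigned int, 2>;

// Hypothetical kernel that records the locator and thread id it was called with.
struct LocatorRecorder
{
    Coord last_locator{ { 99u, 99u } }; // sentinel, overwritten by execute()
    int   last_thread{ -1 };

    void execute(const Coord &thread_locator, int thread_id)
    {
        last_locator = thread_locator;
        last_thread  = thread_id;
    }
};

struct MockWrapper
{
    LocatorRecorder *kernel{ nullptr };

    // Like run(): no locator is supplied, so a value-initialised one is passed on.
    void run(int thread_id)
    {
        assert(kernel != nullptr);
        Coord thread_locator{};
        kernel->execute(thread_locator, thread_id);
    }

    // Like run_nd(): the caller's locator is forwarded to the kernel.
    void run_nd(int thread_id, const Coord &thread_locator)
    {
        assert(kernel != nullptr);
        kernel->execute(thread_locator, thread_id);
    }
};

// Returns the locator the mock kernel observes when driven through run().
inline Coord locator_seen_by_run()
{
    LocatorRecorder rec;
    MockWrapper     wrapper;
    wrapper.kernel = &rec;
    wrapper.run(0);
    return rec.last_locator;
}

// Returns the locator the mock kernel observes when driven through run_nd().
inline Coord locator_seen_by_run_nd(const Coord &locator)
{
    LocatorRecorder rec;
    MockWrapper     wrapper;
    wrapper.kernel = &rec;
    wrapper.run_nd(1, locator);
    return rec.last_locator;
}
```

In this sketch a kernel that ignores the locator behaves identically under both entry points, while a kernel that uses it only receives meaningful coordinates through run_nd().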


The documentation for this class was generated from the following file:
CpuGemmAssemblyWrapperKernel.h