Compute Library
 21.08
CpuGemmAssemblyWrapperKernel< TypeInput, TypeOutput > Class Template Reference final

This class is a wrapper for the assembly kernels.

#include <CpuGemmAssemblyWrapperKernel.h>

Public Member Functions

 CpuGemmAssemblyWrapperKernel ()
 Constructor.
 
 CpuGemmAssemblyWrapperKernel (CpuGemmAssemblyWrapperKernel &)=delete
 
 CpuGemmAssemblyWrapperKernel (CpuGemmAssemblyWrapperKernel &&)=default
 
CpuGemmAssemblyWrapperKernel & operator= (CpuGemmAssemblyWrapperKernel &)=delete
 
const char * name () const override
 Name of the kernel.
 
void run (const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window.
 
void run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator) override
 Legacy compatibility layer for implementations which do not support thread_locator. In these cases we simply narrow the interface down to the legacy version.
 
void configure (arm_gemm::GemmCommon< TypeInput, TypeOutput > *kernel, std::string kernel_name_tag)
 Initialise the kernel's input and output.
 
- Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor.
 
virtual void run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info)
 Execute the kernel on the passed window.
 
- Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.
 
virtual ~IKernel ()=default
 Destructor.
 
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.
 
virtual BorderSize border_size () const
 The size of the border for that kernel.
 
const Window & window () const
 The maximum window the kernel can be executed on.
 
bool is_window_configured () const
 Function to check if the embedded window of this kernel has been configured.
 

Detailed Description

template<typename TypeInput, typename TypeOutput>
class arm_compute::cpu::kernel::CpuGemmAssemblyWrapperKernel< TypeInput, TypeOutput >

This class is a wrapper for the assembly kernels.

Some kernels were written in assembly and highly optimised for specific CPUs such as the A53 or A55. This class works as a wrapper for these assembly kernels. The Arm Compute Library creates an instance of CpuGemmAssemblyWrapperKernel and other auxiliary data structures to execute a single assembly kernel in the context of an NEFunction.

The wrapped kernel, implemented in assembly, is of type arm_gemm::GemmCommon< TypeInput, TypeOutput >, which is declared as template<typename To, typename Tr> class GemmCommon.
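The wrapping idea described above can be illustrated with a minimal, self-contained sketch. It does not use the Arm Compute Library: GemmCommonSketch, WrapperSketch and CountingKernel are hypothetical stand-ins introduced only for illustration, and the one-dimensional (start, end) work range is a simplification of the library's Window type.

```cpp
#include <cassert>

// Hypothetical stand-in for arm_gemm::GemmCommon<To, Tr>; not the library's real interface.
template <typename To, typename Tr>
class GemmCommonSketch
{
public:
    virtual ~GemmCommonSketch() = default;
    // Hypothetical: process the work items in [start, end) on behalf of thread `thread_id`.
    virtual void execute(int start, int end, int thread_id) = 0;
};

// Hypothetical wrapper: stores a non-owning pointer and forwards each work range to it.
template <typename TypeInput, typename TypeOutput>
class WrapperSketch
{
public:
    void configure(GemmCommonSketch<TypeInput, TypeOutput> *kernel)
    {
        assert(kernel != nullptr); // plays the role of the null-pointer check in configure()
        _kernel = kernel;
    }

    void run(int start, int end, int thread_id)
    {
        assert(_kernel != nullptr); // the wrapper must be configured before it is run
        _kernel->execute(start, end, thread_id);
    }

private:
    GemmCommonSketch<TypeInput, TypeOutput> *_kernel{ nullptr };
};

// Toy "assembly kernel" that only counts how many work items it was handed.
class CountingKernel final : public GemmCommonSketch<float, float>
{
public:
    void execute(int start, int end, int /*thread_id*/) override
    {
        processed += end - start;
    }

    int processed{ 0 };
};
```

In this sketch a scheduler would call run() once per sub-range, possibly from different threads; the wrapper itself holds no GEMM logic and does not own the kernel it points to.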

Definition at line 55 of file CpuGemmAssemblyWrapperKernel.h.

Constructor & Destructor Documentation

◆ CpuGemmAssemblyWrapperKernel() [1/3]

Constructor.

Definition at line 60 of file CpuGemmAssemblyWrapperKernel.h.

References CpuGemmAssemblyWrapperKernel< TypeInput, TypeOutput >::operator=().

    : _kernel(nullptr), _name("CpuGemmAssemblyWrapperKernel")
{
}

◆ CpuGemmAssemblyWrapperKernel() [2/3]

CpuGemmAssemblyWrapperKernel ( CpuGemmAssemblyWrapperKernel< TypeInput, TypeOutput > &  )
delete

◆ CpuGemmAssemblyWrapperKernel() [3/3]

CpuGemmAssemblyWrapperKernel ( CpuGemmAssemblyWrapperKernel< TypeInput, TypeOutput > &&  )
default

Member Function Documentation

◆ configure()

void configure ( arm_gemm::GemmCommon< TypeInput, TypeOutput > *  kernel,
std::string  kernel_name_tag 
)
inline

Initialise the kernel's input and output.

Parameters
[in] kernel           Pointer to an assembly kernel implementation.
[in] kernel_name_tag  Tag to be attached to the kernel's name.

Definition at line 104 of file CpuGemmAssemblyWrapperKernel.h.

References ARM_COMPUTE_ERROR_ON_NULLPTR, IGemmCommon::get_window_size(), and arm_gemm::to_window().

{
    ARM_COMPUTE_ERROR_ON_NULLPTR((reinterpret_cast<void *>(kernel)));
    _kernel = kernel;

    Window win = to_window(kernel->get_window_size());

    INEKernel::configure(win);

    if(!kernel_name_tag.empty())
    {
        _name += "/" + kernel_name_tag;
    }
}
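The effect of kernel_name_tag on the value later returned by name() can be shown in isolation. The helper below is a hypothetical free function, not part of the library, that reproduces only the string handling from configure(); the tag value used in the usage note is likewise made up.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mirroring the name handling in configure():
// the tag is appended after a '/' separator only when it is non-empty.
inline std::string tagged_kernel_name(const std::string &base, const std::string &kernel_name_tag)
{
    std::string name = base;
    if(!kernel_name_tag.empty())
    {
        name += "/" + kernel_name_tag;
    }
    return name;
}
```

With the base name set by the constructor, an empty tag leaves "CpuGemmAssemblyWrapperKernel" unchanged, while a tag such as "my_tag" yields "CpuGemmAssemblyWrapperKernel/my_tag".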

◆ name()

const char* name ( ) const
inline override virtual

Name of the kernel.

Returns
Kernel name

Implements ICPPKernel.

Definition at line 69 of file CpuGemmAssemblyWrapperKernel.h.

{
    return _name.c_str();
}

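◆ operator=()

CpuGemmAssemblyWrapperKernel& operator= ( CpuGemmAssemblyWrapperKernel< TypeInput, TypeOutput > &  )
delete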

◆ run()

void run ( const Window &  window,
const ThreadInfo &  info 
)
inline override virtual

Execute the kernel on the passed window.

Warning
If is_parallelisable() returns false then the passed window must be equal to window()
Note
The window has to be a region within the window returned by the window() method
The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
[in] window  Region on which to execute the kernel. (Must be a region of the window returned by window())
[in] info    Info about executing thread and CPU.

Reimplemented from ICPPKernel.

Definition at line 74 of file CpuGemmAssemblyWrapperKernel.h.

References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, IGemmCommon::execute(), ThreadInfo::thread_id, and arm_gemm::to_ndcoord().

{
    ARM_COMPUTE_ERROR_ON_NULLPTR((reinterpret_cast<void *>(_kernel)));
    ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);

    auto win = arm_gemm::to_ndcoord(window);

    arm_gemm::ndcoord_t thread_locator{};

    _kernel->execute(win, thread_locator, info.thread_id);
}

◆ run_nd()

void run_nd ( const Window &  window,
const ThreadInfo &  info,
const Window &  thread_locator 
)
inline override virtual

Legacy compatibility layer for implementations which do not support thread_locator. In these cases we simply narrow the interface down to the legacy version.

Parameters
[in] window          Region on which to execute the kernel. (Must be a region of the window returned by window())
[in] info            Info about executing thread and CPU.
[in] thread_locator  Specifies "where" the current thread is in the multi-dimensional space

Reimplemented from ICPPKernel.

Definition at line 87 of file CpuGemmAssemblyWrapperKernel.h.

References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, IGemmCommon::execute(), ThreadInfo::thread_id, and arm_gemm::to_ndcoord().

{
    ARM_COMPUTE_ERROR_ON_NULLPTR((reinterpret_cast<void *>(_kernel)));
    ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);

    // Convert between arm_compute and arm_gemm types
    auto ndc_win = arm_gemm::to_ndcoord(window);
    auto ndc_tlc = arm_gemm::to_ndcoord(thread_locator);

    _kernel->execute(ndc_win, ndc_tlc, info.thread_id);
}
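The difference between run() and run_nd() is only in the thread locator they hand to execute(): run() passes a value-initialised one, run_nd() passes the caller's. The sketch below illustrates this under stated assumptions: NdCoordSketch is a hypothetical stand-in for arm_gemm::ndcoord_t (modelled here as one (start, size) pair per dimension), and RecordingKernel, run_sketch and run_nd_sketch are illustrative names, not library APIs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for arm_gemm::ndcoord_t: one (start, size) pair per dimension.
constexpr std::size_t kDims = 3;
using NdCoordSketch = std::array<std::pair<unsigned, unsigned>, kDims>;

// Toy kernel that records the arguments of its most recent execute() call.
struct RecordingKernel
{
    NdCoordSketch last_work{};
    NdCoordSketch last_locator{};
    int           last_thread{ -1 };

    void execute(const NdCoordSketch &work, const NdCoordSketch &locator, int thread_id)
    {
        last_work    = work;
        last_locator = locator;
        last_thread  = thread_id;
    }
};

// Legacy path: no locator is supplied, so a value-initialised (all-zero) one is forwarded.
inline void run_sketch(RecordingKernel &k, const NdCoordSketch &work, int thread_id)
{
    NdCoordSketch thread_locator{};
    k.execute(work, thread_locator, thread_id);
}

// N-dimensional path: the caller's locator is forwarded unchanged.
inline void run_nd_sketch(RecordingKernel &k, const NdCoordSketch &work, int thread_id,
                          const NdCoordSketch &thread_locator)
{
    k.execute(work, thread_locator, thread_id);
}
```

In the real class both paths first convert arm_compute::Window arguments with arm_gemm::to_ndcoord(); the sketch skips that conversion and starts from already-converted coordinates.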

The documentation for this class was generated from the following file: