Compute Library
 22.11
CpuWinogradConv2dTransformOutputKernel Class Reference

#include <CpuWinogradConv2dKernel.h>


Public Member Functions

 CpuWinogradConv2dTransformOutputKernel (const CpuWinogradConv2dTransformOutputKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 
CpuWinogradConv2dTransformOutputKernel & operator= (const CpuWinogradConv2dTransformOutputKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 
 CpuWinogradConv2dTransformOutputKernel (CpuWinogradConv2dTransformOutputKernel &&)=delete
 Prevent instances of this class from being moved, as it contains references.
 
CpuWinogradConv2dTransformOutputKernel & operator= (CpuWinogradConv2dTransformOutputKernel &&)=delete
 Prevent instances of this class from being moved, as it contains references.
 
 CpuWinogradConv2dTransformOutputKernel (arm_conv::winograd::WinogradImpl &w_impl, arm_conv::ConvolutionArgs &_c_args, uint32_t nthreads)
 
void run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window.
 
const char * name () const override
 Name of the kernel.
 
- Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor.
 
virtual void run (const Window &window, const ThreadInfo &info)
 Execute the kernel on the passed window.
 
virtual void run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator)
 Legacy compatibility layer for implementations that do not support thread_locator. In these cases the interface is simply narrowed down to the legacy version.
 
virtual size_t get_mws (const CPUInfo &platform, size_t thread_count) const
 Return the minimum workload size of the relevant kernel.
 
- Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.
 
virtual ~IKernel ()=default
 Destructor.
 
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.
 
virtual BorderSize border_size () const
 The size of the border for that kernel.
 
const Window & window () const
 The maximum window the kernel can be executed on.
 
bool is_window_configured () const
 Function to check if the embedded window of this kernel has been configured.
 

Additional Inherited Members

- Static Public Member Functions inherited from ICpuKernel< CpuWinogradConv2dTransformOutputKernel >
static const auto * get_implementation (const SelectorType &selector, KernelSelectionType selection_type=KernelSelectionType::Supported)
 Micro-kernel selector.
 
- Static Public Attributes inherited from ICPPKernel
static constexpr size_t default_mws = 1
 

Detailed Description

Definition at line 71 of file CpuWinogradConv2dKernel.h.

Constructor & Destructor Documentation

◆ CpuWinogradConv2dTransformOutputKernel() [1/3]

Prevent instances of this class from being copied (as this class contains pointers).

◆ CpuWinogradConv2dTransformOutputKernel() [2/3]

Prevent instances of this class from being moved, as it contains references.

◆ CpuWinogradConv2dTransformOutputKernel() [3/3]

CpuWinogradConv2dTransformOutputKernel ( arm_conv::winograd::WinogradImpl &  w_impl,
arm_conv::ConvolutionArgs &  _c_args,
uint32_t  nthreads 
)

Definition at line 68 of file CpuWinogradConv2dKernel.cpp.

    : _winograd_impl{ w_impl }, _conv_args{ _c_args }, _nthreads{ nthreads }
{
}
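
The initializer list above binds the first two constructor arguments to members, presumably held as references, which would explain why copying and moving are both deleted. The following is a minimal sketch of that pattern only. `FakeWinogradImpl`, `FakeConvArgs` and `OutputTransformSketch` are hypothetical stand-ins invented for illustration and are not Compute Library types.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for arm_conv::winograd::WinogradImpl and
// arm_conv::ConvolutionArgs; the real types carry far more state.
struct FakeWinogradImpl
{
    int tile_rows = 2;
};
struct FakeConvArgs
{
    int n_batches = 1;
};

// Hypothetical kernel-like class that keeps references to caller-owned state.
class OutputTransformSketch
{
public:
    OutputTransformSketch(FakeWinogradImpl &w_impl, FakeConvArgs &c_args, uint32_t nthreads)
        : _winograd_impl{ w_impl }, _conv_args{ c_args }, _nthreads{ nthreads }
    {
    }

    // Copy and move are deleted, mirroring the documented class.
    OutputTransformSketch(const OutputTransformSketch &) = delete;
    OutputTransformSketch &operator=(const OutputTransformSketch &) = delete;
    OutputTransformSketch(OutputTransformSketch &&) = delete;
    OutputTransformSketch &operator=(OutputTransformSketch &&) = delete;

    uint32_t nthreads() const { return _nthreads; }
    int      tile_rows() const { return _winograd_impl.tile_rows; }
    int      n_batches() const { return _conv_args.n_batches; }

private:
    FakeWinogradImpl &_winograd_impl; // refers to caller-owned object
    FakeConvArgs     &_conv_args;     // refers to caller-owned object
    uint32_t          _nthreads;
};

static_assert(!std::is_copy_constructible<OutputTransformSketch>::value, "copy must be disabled");
static_assert(!std::is_move_constructible<OutputTransformSketch>::value, "move must be disabled");
```

Because the sketch only stores references, the objects passed to the constructor must outlive the kernel object, and changes made to them afterwards are visible through it.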

Member Function Documentation

◆ name()

const char* name ( ) const
inline override virtual

Name of the kernel.

Returns
Kernel name

Implements ICPPKernel.

Definition at line 91 of file CpuWinogradConv2dKernel.h.

{
    return "CpuWinogradConv2dTransformOutputKernel";
}

◆ operator=() [1/2]

Prevent instances of this class from being copied (as this class contains pointers).

◆ operator=() [2/2]

Prevent instances of this class from being moved, as it contains references.

◆ run_op()

void run_op ( ITensorPack &  tensors,
const Window &  window,
const ThreadInfo &  info 
)
override virtual

Execute the kernel on the passed window.

Warning
If is_parallelisable() returns false then the passed window must be equal to window()
Note
The window has to be a region within the window returned by the window() method
The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
[in] tensors  A vector containing the tensors to operate on.
[in] window   Region on which to execute the kernel. (Must be a region of the window returned by window())
[in] info     Info about executing thread and CPU.

Reimplemented from ICPPKernel.

Definition at line 74 of file CpuWinogradConv2dKernel.cpp.

References arm_compute::ACL_DST, arm_compute::ACL_INT, arm_compute::ACL_SRC_0, arm_compute::ACL_SRC_1, ARM_COMPUTE_UNUSED, ITensor::buffer(), ITensorInfo::element_size(), ITensorPack::get_const_tensor(), ITensorPack::get_tensor(), ITensor::info(), ITensorInfo::offset_first_element_in_bytes(), ITensorInfo::strides_in_bytes(), ThreadInfo::thread_id, and arm_compute::test::validation::reference::winograd_output_transform().

{
    ARM_COMPUTE_UNUSED(window);
    const ITensor *dst_nhwc = tensors.get_const_tensor(TensorType::ACL_DST);
    const ITensor *winograd_output_transform = tensors.get_const_tensor(TensorType::ACL_SRC_0);
    const ITensor *biases = tensors.get_const_tensor(TensorType::ACL_SRC_1);
    const ITensor *workspace = tensors.get_tensor(TensorType::ACL_INT);

    const unsigned int width_idx = 1;
    const unsigned int height_idx = 2;
    const unsigned int batch_idx = 3;
    const int element_size_in_bytes = dst_nhwc->info()->element_size();
    const auto dst_strides = dst_nhwc->info()->strides_in_bytes();

    const size_t out_row_stride = dst_strides[height_idx] / element_size_in_bytes;
    const size_t out_col_stride = dst_strides[width_idx] / element_size_in_bytes;
    const size_t out_batch_stride = dst_strides[batch_idx] / element_size_in_bytes;
    const auto wout_transf_ptr = reinterpret_cast<const void *>(winograd_output_transform->buffer() + winograd_output_transform->info()->offset_first_element_in_bytes());
    auto dst_nhwc_ptr = reinterpret_cast<void *>(dst_nhwc->buffer() + dst_nhwc->info()->offset_first_element_in_bytes());
    void *biases_data_ptr = nullptr;
    if(biases != nullptr)
    {
        biases_data_ptr = reinterpret_cast<void *>(biases->buffer() + biases->info()->offset_first_element_in_bytes());
    }

    // Output transform
    _winograd_impl.output_transform->execute(
        _conv_args,
        wout_transf_ptr,
        _winograd_impl.winograd_spec,
        biases_data_ptr,
        dst_nhwc_ptr,
        out_batch_stride,
        out_row_stride,
        out_col_stride,
        workspace->buffer(),
        info.thread_id,
        _nthreads);
}
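
The listing converts the destination tensor's byte strides into element strides by dividing by the element size before passing them to the output transform. The sketch below illustrates that arithmetic in isolation. The helpers `dense_nhwc_strides_in_bytes` and `to_element_strides` and the struct `ElementStrides` are hypothetical names introduced here, and the sketch assumes a densely packed NHWC tensor with no padding, where dimension 0 is the channel (the listing itself only shows indices 1 to 3).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical container for the three strides passed on by the listing.
struct ElementStrides
{
    std::size_t col;   // step between adjacent columns (width index 1)
    std::size_t row;   // step between adjacent rows (height index 2)
    std::size_t batch; // step between adjacent batches (batch index 3)
};

// Byte strides of a dense, unpadded NHWC tensor (an assumption of this sketch).
// Index order: 0 = channel, 1 = width, 2 = height, 3 = batch.
inline std::array<std::size_t, 4> dense_nhwc_strides_in_bytes(std::size_t channels,
                                                              std::size_t width,
                                                              std::size_t height,
                                                              std::size_t element_size)
{
    const std::size_t c = element_size;
    const std::size_t w = c * channels;
    const std::size_t h = w * width;
    const std::size_t n = h * height;
    return { { c, w, h, n } };
}

// Same division as in the listing: byte stride / element size = element stride.
inline ElementStrides to_element_strides(const std::array<std::size_t, 4> &strides_in_bytes,
                                         std::size_t                       element_size)
{
    return { strides_in_bytes[1] / element_size,
             strides_in_bytes[2] / element_size,
             strides_in_bytes[3] / element_size };
}
```

For a float tensor with 8 channels, width 5 and height 4, the dense byte strides are 4, 32, 160 and 640, so the column, row and batch strides in elements come out as 8, 40 and 160. Real tensors may carry padding, in which case the byte strides reported by the tensor info should be used rather than recomputed.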

The documentation for this class was generated from the following files: