Compute Library
 23.11
CpuWinogradConv2dTransformOutputKernel Class Reference

#include <CpuWinogradConv2dKernel.h>


Public Member Functions

 CpuWinogradConv2dTransformOutputKernel (const CpuWinogradConv2dTransformOutputKernel &)=delete
 Prevent instances of this class from being copied, as it contains pointers.
 
CpuWinogradConv2dTransformOutputKernel & operator= (const CpuWinogradConv2dTransformOutputKernel &)=delete
 Prevent instances of this class from being copied, as it contains pointers.
 
 CpuWinogradConv2dTransformOutputKernel (CpuWinogradConv2dTransformOutputKernel &&)=delete
 Prevent instances of this class from being moved, as it contains references.
 
CpuWinogradConv2dTransformOutputKernel & operator= (CpuWinogradConv2dTransformOutputKernel &&)=delete
 Prevent instances of this class from being moved, as it contains references.
 
 CpuWinogradConv2dTransformOutputKernel (arm_conv::winograd::WinogradImpl &w_impl, arm_conv::ConvolutionArgs &_c_args, uint32_t nthreads)
 
void run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window.
 
const char * name () const override
 Name of the kernel.
 
- Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor.
 
virtual void run (const Window &window, const ThreadInfo &info)
 Execute the kernel on the passed window.
 
virtual void run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator)
 Legacy compatibility layer for implementations that do not support thread_locator. In these cases the interface is simply narrowed down to the legacy version.
 
virtual size_t get_mws (const CPUInfo &platform, size_t thread_count) const
 Return the minimum workload size of the relevant kernel.
 
- Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.
 
virtual ~IKernel ()=default
 Destructor.
 
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.
 
virtual BorderSize border_size () const
 The size of the border for that kernel.
 
const Window & window () const
 The maximum window the kernel can be executed on.
 
bool is_window_configured () const
 Function to check if the embedded window of this kernel has been configured.
 

Additional Inherited Members

- Static Public Member Functions inherited from ICpuKernel< CpuWinogradConv2dTransformOutputKernel >
static const auto * get_implementation (const SelectorType &selector, KernelSelectionType selection_type=KernelSelectionType::Supported)
 Micro-kernel selector.
 
- Static Public Attributes inherited from ICPPKernel
static constexpr size_t default_mws = 1
 

Detailed Description

Definition at line 74 of file CpuWinogradConv2dKernel.h.

Constructor & Destructor Documentation

◆ CpuWinogradConv2dTransformOutputKernel() [1/3]

CpuWinogradConv2dTransformOutputKernel ( const CpuWinogradConv2dTransformOutputKernel & ) =delete

Prevent instances of this class from being copied, as it contains pointers.

◆ CpuWinogradConv2dTransformOutputKernel() [2/3]

CpuWinogradConv2dTransformOutputKernel ( CpuWinogradConv2dTransformOutputKernel && ) =delete

Prevent instances of this class from being moved, as it contains references.

◆ CpuWinogradConv2dTransformOutputKernel() [3/3]

CpuWinogradConv2dTransformOutputKernel ( arm_conv::winograd::WinogradImpl &  w_impl,
arm_conv::ConvolutionArgs &  _c_args,
uint32_t  nthreads 
)

Definition at line 64 of file CpuWinogradConv2dKernel.cpp.

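CpuWinogradConv2dTransformOutputKernel::CpuWinogradConv2dTransformOutputKernel(
    arm_conv::winograd::WinogradImpl &w_impl, arm_conv::ConvolutionArgs &_c_args, uint32_t nthreads)
    : _winograd_impl{w_impl}, _conv_args{_c_args}, _nthreads{nthreads}
{
}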
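
The constructor only binds its arguments to members, which is consistent with the note that the class "contains references" and is therefore neither copyable nor movable. The following is a minimal sketch of that pattern, not the library's code: `FakeWinogradImpl`, `FakeConvArgs`, and `OutputTransformSketch` are hypothetical stand-ins, and the assumption that the members are stored as references is inferred from the constructor signature.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for arm_conv::winograd::WinogradImpl and
// arm_conv::ConvolutionArgs; the real types carry far more state.
struct FakeWinogradImpl
{
    int n_gemms = 0;
};
struct FakeConvArgs
{
    unsigned int n_batches = 1;
};

class OutputTransformSketch
{
public:
    OutputTransformSketch(FakeWinogradImpl &w_impl, FakeConvArgs &c_args, uint32_t nthreads)
        : _winograd_impl{w_impl}, _conv_args{c_args}, _nthreads{nthreads}
    {
        assert(nthreads > 0);
    }

    // Copying or moving would silently alias the caller's objects, so both are disabled.
    OutputTransformSketch(const OutputTransformSketch &)            = delete;
    OutputTransformSketch &operator=(const OutputTransformSketch &) = delete;
    OutputTransformSketch(OutputTransformSketch &&)                 = delete;
    OutputTransformSketch &operator=(OutputTransformSketch &&)      = delete;

    // Accessors read through the references, so later changes made by the owner are visible.
    int          gemms() const { return _winograd_impl.n_gemms; }
    unsigned int batches() const { return _conv_args.n_batches; }
    uint32_t     nthreads() const { return _nthreads; }

private:
    FakeWinogradImpl &_winograd_impl;
    FakeConvArgs     &_conv_args;
    uint32_t          _nthreads;
};

static_assert(!std::is_copy_constructible<OutputTransformSketch>::value, "copy is deleted");
static_assert(!std::is_move_constructible<OutputTransformSketch>::value, "move is deleted");
```

Because only references are held, the objects passed to the constructor must outlive the kernel instance.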

Member Function Documentation

◆ name()

const char* name ( ) const
inline override virtual

Name of the kernel.

Returns
Kernel name

Implements ICPPKernel.

Definition at line 96 of file CpuWinogradConv2dKernel.h.

    const char *name() const override
    {
        return "CpuWinogradConv2dTransformOutputKernel";
    }

◆ operator=() [1/2]

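CpuWinogradConv2dTransformOutputKernel & operator= ( const CpuWinogradConv2dTransformOutputKernel & ) =delete

Prevent instances of this class from being copied, as it contains pointers.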

◆ operator=() [2/2]

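CpuWinogradConv2dTransformOutputKernel & operator= ( CpuWinogradConv2dTransformOutputKernel && ) =delete

Prevent instances of this class from being moved, as it contains references.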

◆ run_op()

void run_op ( ITensorPack &  tensors,
const Window &  window,
const ThreadInfo &  info 
)
override virtual

Execute the kernel on the passed window.

Warning
If is_parallelisable() returns false then the passed window must be equal to window()
Note
The window has to be a region within the window returned by the window() method
The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
[in] tensors  A vector containing the tensors to operate on.
[in] window   Region on which to execute the kernel. (Must be a region of the window returned by window())
[in] info     Info about executing thread and CPU.

Reimplemented from ICPPKernel.
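
This kernel forwards the calling thread's id (from `info`) together with the thread count given at construction to the output transform, which decides which share of the work each thread handles. How arm_conv splits the work is not documented on this page; the sketch below only illustrates one common scheme (contiguous, ceiling-divided chunks). The function `thread_range` is hypothetical and is not part of Compute Library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Illustrative only: returns the half-open range [first, second) of work items
// assigned to thread `thread_id` out of `n_threads`. Trailing threads may get an
// empty range when there are fewer items than threads.
inline std::pair<std::size_t, std::size_t>
thread_range(std::size_t n_items, unsigned int thread_id, unsigned int n_threads)
{
    assert(n_threads > 0 && thread_id < n_threads);
    const std::size_t per_thread = (n_items + n_threads - 1) / n_threads; // ceiling division
    const std::size_t begin      = std::min(n_items, per_thread * thread_id);
    const std::size_t end        = std::min(n_items, begin + per_thread);
    return {begin, end};
}
```

With 10 items and 4 threads this assigns 3 items to each of the first three threads and the single remaining item to the last one.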

Definition at line 72 of file CpuWinogradConv2dKernel.cpp.

void CpuWinogradConv2dTransformOutputKernel::run_op(ITensorPack &tensors, const Window &window, const ThreadInfo &info)
{
    ARM_COMPUTE_UNUSED(window);
    const ITensor *dst_nhwc                  = tensors.get_const_tensor(TensorType::ACL_DST);
    const ITensor *winograd_output_transform = tensors.get_const_tensor(TensorType::ACL_SRC_0);
    const ITensor *biases                    = tensors.get_const_tensor(TensorType::ACL_SRC_1);
    const ITensor *workspace                 = tensors.get_tensor(TensorType::ACL_INT);

    // Dimension indices of the NHWC destination tensor
    const unsigned int width_idx             = 1;
    const unsigned int height_idx            = 2;
    const unsigned int batch_idx             = 3;
    const int          element_size_in_bytes = dst_nhwc->info()->element_size();
    const auto         dst_strides           = dst_nhwc->info()->strides_in_bytes();

    // Convert byte strides into element strides
    const size_t out_row_stride   = dst_strides[height_idx] / element_size_in_bytes;
    const size_t out_col_stride   = dst_strides[width_idx] / element_size_in_bytes;
    const size_t out_batch_stride = dst_strides[batch_idx] / element_size_in_bytes;
    const auto   wout_transf_ptr  = reinterpret_cast<const void *>(
        winograd_output_transform->buffer() + winograd_output_transform->info()->offset_first_element_in_bytes());
    auto dst_nhwc_ptr =
        reinterpret_cast<void *>(dst_nhwc->buffer() + dst_nhwc->info()->offset_first_element_in_bytes());
    // Biases are optional; a null pointer is passed through when none are provided
    void *biases_data_ptr = nullptr;
    if (biases != nullptr)
    {
        biases_data_ptr = reinterpret_cast<void *>(biases->buffer() + biases->info()->offset_first_element_in_bytes());
    }

    // Output transform
    _winograd_impl.output_transform->execute(_conv_args, wout_transf_ptr, _winograd_impl.winograd_spec, biases_data_ptr,
                                             dst_nhwc_ptr, out_batch_stride, out_row_stride, out_col_stride,
                                             workspace->buffer(), info.thread_id, _nthreads);
}
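
The stride arithmetic above converts the byte strides reported by the tensor info into element strides before handing them to the output transform. The following self-contained sketch reproduces that conversion outside the library. `ElementStrides`, `to_element_strides`, and `packed_nhwc_strides_in_bytes` are hypothetical helpers, and the densely packed layout (no padding) is an assumption made only to produce sample byte strides.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Element (not byte) strides, as passed on to the output transform.
struct ElementStrides
{
    std::size_t col;   // step between adjacent columns (width)
    std::size_t row;   // step between adjacent rows (height)
    std::size_t batch; // step between adjacent batches
};

// Byte strides are indexed by dimension; for NHWC: 0 = channels, 1 = width, 2 = height, 3 = batch.
inline ElementStrides to_element_strides(const std::array<std::size_t, 4> &strides_in_bytes,
                                         std::size_t                       element_size_in_bytes)
{
    assert(element_size_in_bytes > 0);
    constexpr std::size_t width_idx  = 1;
    constexpr std::size_t height_idx = 2;
    constexpr std::size_t batch_idx  = 3;
    return {strides_in_bytes[width_idx] / element_size_in_bytes,
            strides_in_bytes[height_idx] / element_size_in_bytes,
            strides_in_bytes[batch_idx] / element_size_in_bytes};
}

// Assumption: a densely packed NHWC tensor without padding.
inline std::array<std::size_t, 4> packed_nhwc_strides_in_bytes(std::size_t channels,
                                                               std::size_t width,
                                                               std::size_t height,
                                                               std::size_t element_size)
{
    const std::size_t c = element_size; // next channel
    const std::size_t w = c * channels; // next column
    const std::size_t h = w * width;    // next row
    const std::size_t n = h * height;   // next batch
    return {c, w, h, n};
}
```

For a 4-byte element type with 16 channels, width 8, and height 4, the packed byte strides are {4, 64, 512, 2048}, which convert to a column stride of 16, a row stride of 128, and a batch stride of 512 elements.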

References arm_compute::ACL_DST, arm_compute::ACL_INT, arm_compute::ACL_SRC_0, arm_compute::ACL_SRC_1, ARM_COMPUTE_UNUSED, ITensor::buffer(), ITensorInfo::element_size(), ITensorPack::get_const_tensor(), ITensorPack::get_tensor(), arm_compute::cpu::height_idx, ITensor::info(), arm_compute::test::validation::info, ITensorInfo::offset_first_element_in_bytes(), ITensorInfo::strides_in_bytes(), arm_compute::cpu::width_idx, IKernel::window(), and arm_compute::test::validation::reference::winograd_output_transform().


The documentation for this class was generated from the following files:
CpuWinogradConv2dKernel.h
CpuWinogradConv2dKernel.cpp