21.02
Neon kernel to perform Winograd input transform.
#include <NEWinogradConvolutionLayerKernel.h>
Public Types
using | WinogradBase = winograd::WinogradGEMM< OutputTileRows, OutputTileCols, KernelRows, KernelCols, winograd::WinogradRoots::Integers >
Winograd base kernel.
using | WinogradConv = typename WinogradBase::template Convolution< T, T >
Winograd convolution kernel.
Public Member Functions
NEWinogradLayerTransformInputKernel (const NEWinogradLayerTransformInputKernel &)=delete
Prevent instances of this class from being copied (as this class contains pointers).
NEWinogradLayerTransformInputKernel & | operator= (const NEWinogradLayerTransformInputKernel &)=delete
Prevent instances of this class from being copied (as this class contains pointers).
NEWinogradLayerTransformInputKernel (NEWinogradLayerTransformInputKernel &&)=default
Allow instances of this class to be moved.
NEWinogradLayerTransformInputKernel & | operator= (NEWinogradLayerTransformInputKernel &&)=default
Allow instances of this class to be moved.
~NEWinogradLayerTransformInputKernel ()=default
Default destructor.
unsigned int | get_input_storage_size (int num_batches, int num_channels, int num_rows, int num_cols, bool same_padding) const override
Determine how much memory (in units of TIn) to allocate for the transformed input.
unsigned int | get_working_space_size (unsigned int num_threads) const override
Get the working space required to perform the transformation.
int | get_matrix_stride (int num_batches, int num_channels, int num_rows, int num_cols, bool same_padding) const override
Gets the stride between matrices in the input workspace.
NEWinogradLayerTransformInputKernel ()
Default constructor.
const char * | name () const override
Name of the kernel.
void | configure (const ITensor *input_nhwc, const int num_batches, const int num_rows, const int num_cols, const int num_channels, const PaddingType padding, ITensor *output, const int matrix_stride, ITensor *workspace) override
Configure the input transform kernel.
void | run (const Window &window, const ThreadInfo &info) override
Execute the kernel on the passed window.
Public Member Functions inherited from INEWinogradLayerTransformInputKernel
virtual | ~INEWinogradLayerTransformInputKernel ()
Destructor.
Public Member Functions inherited from ICPPKernel
virtual | ~ICPPKernel ()=default
Default destructor.
virtual void | run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator)
Legacy compatibility layer for implementations which do not support thread_locator. In these cases we simply narrow the interface down to the legacy version.
virtual void | run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info)
Execute the kernel on the passed window.
Public Member Functions inherited from IKernel
IKernel ()
Constructor.
virtual | ~IKernel ()=default
Destructor.
virtual bool | is_parallelisable () const
Indicates whether or not the kernel is parallelisable.
virtual BorderSize | border_size () const
The size of the border for that kernel.
const Window & | window () const
The maximum window the kernel can be executed on.
Static Public Member Functions
static Status | validate (const ITensorInfo *input, const ITensorInfo *output, const WinogradInfo &winograd_info)
Static function to check if given info will lead to a valid configuration of NEWinogradLayerTransformInputKernel.
Neon kernel to perform Winograd input transform.
Definition at line 101 of file NEWinogradConvolutionLayerKernel.h.
using WinogradBase = winograd::WinogradGEMM<OutputTileRows, OutputTileCols, KernelRows, KernelCols, winograd::WinogradRoots::Integers>
Winograd base kernel.
Definition at line 197 of file NEWinogradConvolutionLayerKernel.h.
using WinogradConv = typename WinogradBase::template Convolution<T, T>
Winograd convolution kernel.
Definition at line 199 of file NEWinogradConvolutionLayerKernel.h.
NEWinogradLayerTransformInputKernel (const NEWinogradLayerTransformInputKernel &)=delete
Prevent instances of this class from being copied (as this class contains pointers).
NEWinogradLayerTransformInputKernel (NEWinogradLayerTransformInputKernel &&)=default
Allow instances of this class to be moved.
~NEWinogradLayerTransformInputKernel ()=default
Default destructor.
NEWinogradLayerTransformInputKernel ()
Default constructor.
Definition at line 318 of file NEWinogradConvolutionLayerKernel.cpp.
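void configure (const ITensor *input_nhwc, const int num_batches, const int num_rows, const int num_cols, const int num_channels, const PaddingType padding, ITensor *output, const int matrix_stride, ITensor *workspace) override
Configure the input transform kernel.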
[in] | input_nhwc | Input tensor. Data types supported: F16/F32. Layout supported NHWC. |
[in] | num_batches | Number of batches in input tensor. |
[in] | num_rows | Number of rows in input tensor. |
[in] | num_cols | Number of columns in input tensor. |
[in] | num_channels | Number of channels in input tensor. |
[in] | padding | Padding type. |
[out] | output | Base of output matrices. |
[in] | matrix_stride | Stride between output matrices. |
[in] | workspace | Tensor to be used as the working space during the computation. |
Implements INEWinogradLayerTransformInputKernel.
Definition at line 325 of file NEWinogradConvolutionLayerKernel.cpp.
References Window::DimX, and arm_gemm::iceildiv().
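Since configure() references arm_gemm::iceildiv(), the following is a minimal sketch of how a rounded-up integer division can be used to count the Winograd tiles along one dimension. The helper names (ceil_div, output_extent, tiles_along) are hypothetical and not part of the library, and the sketch assumes a stride-1 convolution; the actual computation inside configure() may differ.

```cpp
// Hypothetical helper illustrating integer division rounded up,
// the same idea as arm_gemm::iceildiv() (assumes a >= 0 and b > 0).
constexpr int ceil_div(int a, int b)
{
    return (a + b - 1) / b;
}

// Output extent along one dimension for a stride-1 convolution (assumption).
// "SAME" padding keeps the input extent; "VALID" shrinks it by (kernel - 1).
constexpr int output_extent(int input_extent, int kernel_extent, bool same_padding)
{
    return same_padding ? input_extent : input_extent - kernel_extent + 1;
}

// Number of output tiles needed to cover one output dimension.
constexpr int tiles_along(int input_extent, int kernel_extent, int output_tile_extent, bool same_padding)
{
    return ceil_div(output_extent(input_extent, kernel_extent, same_padding), output_tile_extent);
}
```

For example, with 7 input rows, a 3-row kernel and 2-row output tiles, this sketch gives 4 tiles with "SAME" padding and 3 tiles with "VALID" padding.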
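unsigned int get_input_storage_size (int num_batches, int num_channels, int num_rows, int num_cols, bool same_padding) const override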
Determine how much memory (in units of TIn) to allocate for the transformed input.
[in] | num_batches | Number of batches in the input tensor. |
[in] | num_channels | Number of feature maps in the input tensor. |
[in] | num_rows | Number of rows in each feature map. |
[in] | num_cols | Number of columns in each feature map. |
[in] | same_padding | Use "SAME" padding, otherwise use "VALID". |
Implements INEWinogradLayerTransformInputKernel.
Definition at line 285 of file NEWinogradConvolutionLayerKernel.cpp.
References arm_compute::test::validation::input_shape.
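To give a feel for the size this function reports, here is a rough sketch of the usual Winograd bookkeeping: the transformed input is stored as one matrix per element of the input tile, and each matrix has one row per (batch, tile) pair and one column per channel. All function names below are hypothetical, and the sketch ignores any rounding or alignment the library may apply, so treat the result as an estimate rather than the value returned by get_input_storage_size().

```cpp
// Number of transformed matrices: one per element of the
// (out_tile_rows + kernel_rows - 1) x (out_tile_cols + kernel_cols - 1) input tile.
constexpr long long num_matrices(int out_tile_rows, int out_tile_cols, int kernel_rows, int kernel_cols)
{
    return static_cast<long long>(out_tile_rows + kernel_rows - 1) * (out_tile_cols + kernel_cols - 1);
}

// Elements in one matrix: rows = batches * tiles, columns = channels (no padding assumed).
constexpr long long matrix_elements(int num_batches, int tile_rows, int tile_cols, int num_channels)
{
    return static_cast<long long>(num_batches) * tile_rows * tile_cols * num_channels;
}

// Total storage, in elements, for all transformed matrices.
constexpr long long storage_elements(long long matrices, long long elements_per_matrix)
{
    return matrices * elements_per_matrix;
}
```

With 2x2 output tiles and a 3x3 kernel there are 16 matrices; for one batch, a 4x4 grid of tiles and 8 channels, each matrix holds 128 elements, for 2048 elements in total.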
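int get_matrix_stride (int num_batches, int num_channels, int num_rows, int num_cols, bool same_padding) const override
Gets the stride between matrices in the input workspace.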
[in] | num_batches | Number of batches in the input tensor. |
[in] | num_channels | Number of feature maps in the input tensor. |
[in] | num_rows | Number of rows in each feature map. |
[in] | num_cols | Number of columns in each feature map. |
[in] | same_padding | Use "SAME" padding, otherwise use "VALID". |
Implements INEWinogradLayerTransformInputKernel.
Definition at line 307 of file NEWinogradConvolutionLayerKernel.cpp.
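As an illustration of what a matrix stride is used for, the sketch below computes the offset of one element inside a block of equally spaced matrices. It assumes the stride is expressed in elements rather than bytes and that each matrix is stored row-major with a given row stride; both are assumptions, and element_offset is a hypothetical helper, not a library function.

```cpp
#include <cstddef>

// Offset (in elements) of element (row, col) of matrix `matrix_index`,
// assuming consecutive matrices are `matrix_stride` elements apart and
// consecutive rows within a matrix are `row_stride` elements apart.
constexpr std::size_t element_offset(std::size_t matrix_stride, std::size_t matrix_index,
                                     std::size_t row, std::size_t col, std::size_t row_stride)
{
    return matrix_index * matrix_stride + row * row_stride + col;
}
```

For instance, with a matrix stride of 128 and a row stride of 8, element (2, 5) of matrix 3 sits at offset 3 * 128 + 2 * 8 + 5 = 405.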
unsigned int get_working_space_size (unsigned int num_threads) const override
Get the working space required to perform the transformation.
Note: the working space is only required while the transformation is running, so it can be reused whenever the transformation is not running.
[in] | num_threads | The greatest number of threads that will be used to execute the transform. |
Implements INEWinogradLayerTransformInputKernel.
Definition at line 301 of file NEWinogradConvolutionLayerKernel.cpp.
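Because the size depends on the maximum number of threads, one plausible layout is a single buffer divided into equal per-thread slices. The sketch below shows that idea only; the per-thread size, its units, and the helper names are assumptions, and the library's actual layout may be different.

```cpp
#include <cstddef>

// Total working space if every thread gets an equally sized private slice (assumption).
constexpr std::size_t total_working_space(std::size_t per_thread_size, unsigned int num_threads)
{
    return per_thread_size * num_threads;
}

// Offset of the slice belonging to the thread with the given id (0-based).
constexpr std::size_t thread_slice_offset(std::size_t per_thread_size, unsigned int thread_id)
{
    return per_thread_size * thread_id;
}
```

Sizing the buffer for the greatest thread count means that every valid thread id maps to a slice that lies fully inside the buffer.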
const char * name () const override
Name of the kernel.
Implements ICPPKernel.
Definition at line 165 of file NEWinogradConvolutionLayerKernel.h.
References INEWinogradLayerTransformInputKernel::configure(), arm_compute::test::validation::info, ICPPKernel::run(), and IKernel::window().
NEWinogradLayerTransformInputKernel & operator= (const NEWinogradLayerTransformInputKernel &)=delete
Prevent instances of this class from being copied (as this class contains pointers).
NEWinogradLayerTransformInputKernel & operator= (NEWinogradLayerTransformInputKernel &&)=default
Allow instances of this class to be moved.
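The copy and move declarations above follow a common C++ pattern for classes that hold raw pointers: copying is disabled, moving is allowed. The sketch below reproduces that pattern on a hypothetical stand-in class (PointerHoldingKernel is not a library type) and checks the resulting properties at compile time.

```cpp
#include <type_traits>

// Hypothetical stand-in for a kernel that stores non-owning pointers.
class PointerHoldingKernel
{
public:
    PointerHoldingKernel() = default;
    // Copying is disabled, mirroring the "=delete" declarations.
    PointerHoldingKernel(const PointerHoldingKernel &) = delete;
    PointerHoldingKernel &operator=(const PointerHoldingKernel &) = delete;
    // Moving is allowed, mirroring the "=default" declarations.
    PointerHoldingKernel(PointerHoldingKernel &&) = default;
    PointerHoldingKernel &operator=(PointerHoldingKernel &&) = default;
    ~PointerHoldingKernel() = default;

private:
    const float *_input{ nullptr };
    float       *_output{ nullptr };
};

static_assert(!std::is_copy_constructible<PointerHoldingKernel>::value, "copy construction is disabled");
static_assert(!std::is_copy_assignable<PointerHoldingKernel>::value, "copy assignment is disabled");
static_assert(std::is_move_constructible<PointerHoldingKernel>::value, "move construction is allowed");
static_assert(std::is_move_assignable<PointerHoldingKernel>::value, "move assignment is allowed");
```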
void run (const Window &window, const ThreadInfo &info) override
Execute the kernel on the passed window.
[in] | window | Region on which to execute the kernel. (Must be a region of the window returned by window()) |
[in] | info | Info about executing thread and CPU. |
Reimplemented from ICPPKernel.
Definition at line 371 of file NEWinogradConvolutionLayerKernel.cpp.
References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, ITensor::buffer(), ITensorInfo::element_size(), Window::Dimension::end(), ITensor::info(), ITensorInfo::offset_first_element_in_bytes(), Window::Dimension::start(), ITensorInfo::strides_in_bytes(), ThreadInfo::thread_id, Window::x(), Dimensions< T >::y(), and Dimensions< T >::z().
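The requirement that the window passed to run() is a region of the window returned by window() can be pictured with a one-dimensional range. The sketch below splits a full range into contiguous chunks, as a scheduler might do per thread, and checks that each chunk stays inside the full range. Range, chunk and is_subrange are hypothetical names for illustration; they do not model the library's Window class or its scheduler.

```cpp
// Half-open range [start, end) along one dimension.
struct Range
{
    int start;
    int end;
};

// The i-th of num_chunks contiguous chunks covering `full` (assumes num_chunks > 0).
constexpr Range chunk(Range full, int num_chunks, int i)
{
    return Range{ full.start + (full.end - full.start) * i / num_chunks,
                  full.start + (full.end - full.start) * (i + 1) / num_chunks };
}

// True if `sub` is a well-formed range lying entirely inside `full`.
constexpr bool is_subrange(Range sub, Range full)
{
    return full.start <= sub.start && sub.start <= sub.end && sub.end <= full.end;
}
```

Splitting [0, 10) into three chunks with this sketch yields [0, 3), [3, 6) and [6, 10), each of which is a valid sub-range of the original.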
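static Status validate (const ITensorInfo *input, const ITensorInfo *output, const WinogradInfo &winograd_info)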
Static function to check if given info will lead to a valid configuration of NEWinogradLayerTransformInputKernel.
[in] | input | First tensor input info. Data types supported: F16/F32. |
[in] | output | Output tensor info. Data types supported: same as input. |
[in] | winograd_info | Contains Winograd's information described in WinogradInfo. |
Definition at line 397 of file NEWinogradConvolutionLayerKernel.cpp.
References ARM_COMPUTE_RETURN_ON_ERROR, ICloneable< T >::clone(), and arm_compute::test::validation::winograd_info.