21.02
Interface for the Neon kernel to perform Winograd input transform. More...
#include <NEWinogradConvolutionLayerKernel.h>
Public Member Functions | |
virtual unsigned int | get_working_space_size (unsigned int num_threads) const =0 |
Get the working space required to perform the transformation. More... | |
virtual unsigned int | get_input_storage_size (int num_batches, int num_channels, int num_rows, int num_cols, bool same_padding) const =0 |
Determine how much memory (in units of TIn) to allocate for the transformed input. More... | |
virtual int | get_matrix_stride (int num_batches, int num_channels, int num_rows, int num_cols, bool same_padding) const =0 |
Gets the stride between matrices in the input workspace. More... | |
virtual void | configure (const ITensor *input_nhwc, const int num_batches, const int num_rows, const int num_cols, const int num_channels, const PaddingType padding, ITensor *output, const int matrix_stride, ITensor *workspace)=0 |
Configure the input transform kernel. More... | |
virtual | ~INEWinogradLayerTransformInputKernel () |
Destructor. More... | |
Public Member Functions inherited from ICPPKernel | |
virtual | ~ICPPKernel ()=default |
Default destructor. More... | |
virtual void | run (const Window &window, const ThreadInfo &info) |
Execute the kernel on the passed window. More... | |
virtual void | run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator) |
Legacy compatibility layer for implementations which do not support thread_locator. In these cases we simply narrow the interface down to the legacy version. More... | |
virtual void | run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info) |
Execute the kernel on the passed window. More... | |
virtual const char * | name () const =0 |
Name of the kernel. More... | |
Public Member Functions inherited from IKernel | |
IKernel () | |
Constructor. More... | |
virtual | ~IKernel ()=default |
Destructor. More... | |
virtual bool | is_parallelisable () const |
Indicates whether or not the kernel is parallelisable. More... | |
virtual BorderSize | border_size () const |
The size of the border for that kernel. More... | |
const Window & | window () const |
The maximum window the kernel can be executed on. More... | |
Interface for the Neon kernel to perform Winograd input transform.
Definition at line 39 of file NEWinogradConvolutionLayerKernel.h.
virtual ~INEWinogradLayerTransformInputKernel () | inline, virtual |
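virtual void configure (const ITensor *input_nhwc, const int num_batches, const int num_rows, const int num_cols, const int num_channels, const PaddingType padding, ITensor *output, const int matrix_stride, ITensor *workspace) | pure virtual |
Configure the input transform kernel.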
[in] | input_nhwc | Input tensor in NHWC data layout format. |
[in] | num_batches | Number of batches in input tensor. |
[in] | num_rows | Number of rows in input tensor. |
[in] | num_cols | Number of columns in input tensor. |
[in] | num_channels | Number of channels in input tensor. |
[in] | padding | Padding type. |
[out] | output | Base of output matrices. |
[in] | matrix_stride | Stride between output matrices. |
[in] | workspace | Tensor to be used as the working space during the computation. |
Implemented in NEWinogradLayerTransformInputKernel< T, OutputTileRows, OutputTileCols, KernelRows, KernelCols >.
Referenced by NEWinogradLayerTransformInputKernel< T, OutputTileRows, OutputTileCols, KernelRows, KernelCols >::name(), NEWinogradLayerTransformOutputKernel< T, OutputTileRows, OutputTileCols, KernelRows, KernelCols >::name(), NEWinogradLayerTransformWeightsKernel< T, OutputTileRows, OutputTileCols, KernelRows, KernelCols >::name(), and INEWinogradLayerTransformWeightsKernel::~INEWinogradLayerTransformWeightsKernel().
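The parameter list above suggests a natural call order: query the sizes, allocate the buffers, then call configure. The sketch below illustrates that order only. It does not use the library: `FakeTensor`, `FakePadding`, `InputTransformLike`, `StubKernel` and `setup_input_transform` are hypothetical stand-ins, the stub's sizes are made up, and treating the working space size as a byte count is an assumption, since the unit is not stated here. Note that the size queries take `(num_batches, num_channels, num_rows, num_cols)` while configure takes `(num_batches, num_rows, num_cols, num_channels)`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins so the sketch compiles without the library.
// The real ITensor and PaddingType are different types.
struct FakeTensor
{
    std::vector<unsigned char> bytes;
};
enum class FakePadding { Same, Valid };

// Reduced mirror of the documented interface, using the stand-in types.
class InputTransformLike
{
public:
    virtual ~InputTransformLike() = default;
    virtual unsigned int get_working_space_size(unsigned int num_threads) const = 0;
    virtual unsigned int get_input_storage_size(int num_batches, int num_channels, int num_rows, int num_cols,
                                                bool same_padding) const = 0;
    virtual int get_matrix_stride(int num_batches, int num_channels, int num_rows, int num_cols,
                                  bool same_padding) const = 0;
    virtual void configure(const FakeTensor *input_nhwc, int num_batches, int num_rows, int num_cols,
                           int num_channels, FakePadding padding, FakeTensor *output, int matrix_stride,
                           FakeTensor *workspace) = 0;
};

// Stub implementation with invented sizes; it only records what configure() received.
class StubKernel final : public InputTransformLike
{
public:
    unsigned int get_working_space_size(unsigned int num_threads) const override
    {
        return 32u * num_threads; // made-up figure
    }
    unsigned int get_input_storage_size(int b, int c, int r, int w, bool same) const override
    {
        return 16u * static_cast<unsigned int>(get_matrix_stride(b, c, r, w, same)); // made-up figure
    }
    int get_matrix_stride(int b, int c, int r, int w, bool) const override
    {
        return b * c * r * w; // made-up figure
    }
    void configure(const FakeTensor *, int, int, int, int num_channels, FakePadding, FakeTensor *,
                   int matrix_stride, FakeTensor *) override
    {
        configured_channels = num_channels;
        configured_stride   = matrix_stride;
    }

    int configured_channels = -1;
    int configured_stride   = -1;
};

// Query sizes, allocate the output and working space, then configure.
// TIn is the element type of the transformed input.
template <typename TIn>
void setup_input_transform(InputTransformLike &kernel, const FakeTensor &input, int num_batches, int num_rows,
                           int num_cols, int num_channels, FakePadding padding, unsigned int num_threads,
                           FakeTensor &output, FakeTensor &workspace)
{
    const bool same = (padding == FakePadding::Same);

    // Size queries: channels come before rows and columns.
    const unsigned int elems  = kernel.get_input_storage_size(num_batches, num_channels, num_rows, num_cols, same);
    const int          stride = kernel.get_matrix_stride(num_batches, num_channels, num_rows, num_cols, same);
    assert(stride > 0);

    // Storage size is documented in units of TIn, so scale by the element size.
    output.bytes.resize(static_cast<std::size_t>(elems) * sizeof(TIn));
    // Assumption: the working space size is a byte count.
    workspace.bytes.resize(kernel.get_working_space_size(num_threads));

    // configure(): channels come after rows and columns.
    kernel.configure(&input, num_batches, num_rows, num_cols, num_channels, padding, &output, stride, &workspace);
}
```

Keeping the queries and the configure call in one helper makes it harder to pass the dimensions in the wrong order, which is easy to do given the two different parameter orders.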
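virtual unsigned int get_input_storage_size (int num_batches, int num_channels, int num_rows, int num_cols, bool same_padding) const | pure virtual |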
Determine how much memory (in units of TIn) to allocate for the transformed input.
[in] | num_batches | Number of batches in the input tensor. |
[in] | num_channels | Number of feature maps in the input tensor. |
[in] | num_rows | Number of rows in each feature map. |
[in] | num_cols | Number of columns in each feature map. |
[in] | same_padding | Use "SAME" padding, otherwise use "VALID". |
Implemented in NEWinogradLayerTransformInputKernel< T, OutputTileRows, OutputTileCols, KernelRows, KernelCols >.
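virtual int get_matrix_stride (int num_batches, int num_channels, int num_rows, int num_cols, bool same_padding) const | pure virtual |
Gets the stride between matrices in the input workspace.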
[in] | num_batches | Number of batches in the input tensor. |
[in] | num_channels | Number of feature maps in the input tensor. |
[in] | num_rows | Number of rows in each feature map. |
[in] | num_cols | Number of columns in each feature map. |
[in] | same_padding | Use "SAME" padding, otherwise use "VALID". |
Implemented in NEWinogradLayerTransformInputKernel< T, OutputTileRows, OutputTileCols, KernelRows, KernelCols >.
Referenced by NEWinogradLayerTransformOutputKernel< T, OutputTileRows, OutputTileCols, KernelRows, KernelCols >::name(), NEWinogradLayerTransformWeightsKernel< T, OutputTileRows, OutputTileCols, KernelRows, KernelCols >::name(), and INEWinogradLayerTransformWeightsKernel::~INEWinogradLayerTransformWeightsKernel().
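To give a feel for how the storage size and the matrix stride relate, here is a rough sketch of the usual Winograd bookkeeping. It is not the library's implementation. It assumes unit convolution stride, one transformed matrix per element of the input tile, and densely packed matrices of shape [batches x tiles, channels]; a real kernel may round these up for alignment. `WinogradShape`, `sketch_matrix_stride` and `sketch_input_storage_size` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of one Winograd configuration.
struct WinogradShape
{
    int output_tile_rows;
    int output_tile_cols;
    int kernel_rows;
    int kernel_cols;
};

// Ceiling division for a non-negative numerator and positive denominator.
inline int ceil_div(int a, int b)
{
    assert(a >= 0 && b > 0);
    return (a + b - 1) / b;
}

// Output extent along one dimension for a stride-1 convolution:
// SAME keeps the input extent, VALID shrinks it by (kernel - 1).
inline int output_extent(int input_extent, int kernel_extent, bool same_padding)
{
    return same_padding ? input_extent : input_extent - kernel_extent + 1;
}

// Elements between consecutive transformed matrices, assuming dense packing
// with no alignment padding (an assumption of this sketch).
inline std::size_t sketch_matrix_stride(const WinogradShape &s, int num_batches, int num_channels, int num_rows,
                                        int num_cols, bool same_padding)
{
    const int tile_rows = ceil_div(output_extent(num_rows, s.kernel_rows, same_padding), s.output_tile_rows);
    const int tile_cols = ceil_div(output_extent(num_cols, s.kernel_cols, same_padding), s.output_tile_cols);
    return static_cast<std::size_t>(num_batches) * tile_rows * tile_cols * num_channels;
}

// Total elements: one matrix per element of the (output_tile + kernel - 1) input tile.
inline std::size_t sketch_input_storage_size(const WinogradShape &s, int num_batches, int num_channels,
                                             int num_rows, int num_cols, bool same_padding)
{
    const int inner_rows = s.output_tile_rows + s.kernel_rows - 1;
    const int inner_cols = s.output_tile_cols + s.kernel_cols - 1;
    return static_cast<std::size_t>(inner_rows) * inner_cols *
           sketch_matrix_stride(s, num_batches, num_channels, num_rows, num_cols, same_padding);
}
```

Under these assumptions, a 2x2 output tile with a 3x3 kernel, one batch, 8 channels, a 6x6 feature map and SAME padding gives 9 tiles, a stride of 72 elements, and 16 matrices in total.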
virtual unsigned int get_working_space_size (unsigned int num_threads) const | pure virtual |
Get the working space required to perform the transformation.
Note: the working space is only required while the transformation is running, so it can be reused whenever the transformation is not running.
[in] | num_threads | The greatest number of threads that will be used to execute the transform. |
Implemented in NEWinogradLayerTransformInputKernel< T, OutputTileRows, OutputTileCols, KernelRows, KernelCols >.
Referenced by NEWinogradLayerTransformOutputKernel< T, OutputTileRows, OutputTileCols, KernelRows, KernelCols >::name().
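Because the working space is sized for the greatest number of threads, one plausible layout is a single buffer split into equal per-thread slices. The sketch below shows that idea only; `ScratchArena` is a hypothetical helper, and the equal-slice layout and byte units are assumptions rather than anything stated for the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scratch buffer shared by up to max_threads workers.
// Each worker gets a disjoint slice, so no locking is needed while the transform runs.
class ScratchArena
{
public:
    ScratchArena(std::size_t bytes_per_thread, unsigned int max_threads)
        : _bytes_per_thread(bytes_per_thread),
          _max_threads(max_threads),
          _storage(bytes_per_thread * max_threads)
    {
    }

    // Total allocation, comparable to what a size query for max_threads might return.
    std::size_t total_bytes() const
    {
        return _storage.size();
    }

    // Start of the slice owned by the given thread.
    unsigned char *slice_for(unsigned int thread_id)
    {
        assert(thread_id < _max_threads);
        return _storage.data() + static_cast<std::size_t>(thread_id) * _bytes_per_thread;
    }

private:
    std::size_t                _bytes_per_thread;
    unsigned int               _max_threads;
    std::vector<unsigned char> _storage;
};
```

Since the buffer is only needed while the transformation runs, the same arena could be handed to another stage once the transform has finished.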