21.05
Interface for the kernel to perform Winograd output transform.
#include <NEWinogradConvolutionLayerKernel.h>
Public Member Functions
virtual unsigned int | get_working_space_size (unsigned int num_threads) const =0
Get the working space required to perform the transformation.
virtual unsigned int | get_output_storage_size (int num_batches, int num_rows, int num_cols, int num_output_channels) const =0
Determine how much memory (in units of TOut) to allocate for the (Winograd domain) output.
virtual int | get_matrix_stride (int num_batches, int num_rows, int num_cols, int num_output_channels) const =0
Gets the stride between matrices in the output workspace.
virtual std::pair< unsigned int, unsigned int > | get_output_shape (int num_rows, int num_cols, bool padding_same) const =0
Get the output shape of a convolution.
virtual void | configure (const ITensor *biases, const ITensor *transformed_output, const int matrix_stride, ITensor *output_nhwc, const int num_batches, const int num_rows, const int num_cols, const int num_channels, ITensor *workspace, const arm_gemm::Activation &activation)=0
Configure the output transform kernel.
virtual | ~INEWinogradLayerTransformOutputKernel ()
Public Member Functions inherited from ICPPKernel
virtual | ~ICPPKernel ()=default
Default destructor.
virtual void | run (const Window &window, const ThreadInfo &info)
Execute the kernel on the passed window.
virtual void | run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator)
Legacy compatibility layer for implementations that do not support thread_locator. In these cases the interface is simply narrowed down to the legacy version.
virtual void | run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info)
Execute the kernel on the passed window.
virtual const char * | name () const =0
Name of the kernel.
Public Member Functions inherited from IKernel
IKernel ()
Constructor.
virtual | ~IKernel ()=default
Destructor.
virtual bool | is_parallelisable () const
Indicates whether or not the kernel is parallelisable.
virtual BorderSize | border_size () const
The size of the border for that kernel.
const Window & | window () const
The maximum window the kernel can be executed on.
bool | is_window_configured () const
Function to check if the embedded window of this kernel has been configured.
Interface for the kernel to perform Winograd output transform.
Definition at line 231 of file NEWinogradConvolutionLayerKernel.h.
~INEWinogradLayerTransformOutputKernel () | inline virtual |
Definition at line 308 of file NEWinogradConvolutionLayerKernel.h.
configure () | pure virtual |
Configure the output transform kernel.
[in] | biases | Pointer to the biases tensor. |
[in] | transformed_output | Pointer to working space for the output tensor in the Winograd domain. |
[in] | matrix_stride | Output matrix stride, can be computed with winograd::WinogradGEMM<2, 2, 3, 3>::Convolution<float, float>::get_output_matrix_stride() |
[out] | output_nhwc | Pointer to the output tensor, in NHWC data layout, in the spatial domain. |
[in] | num_batches | Number of batches in the input tensor. |
[in] | num_rows | Number of rows in output tensor. |
[in] | num_cols | Number of columns in output tensor. |
[in] | num_channels | Number of feature maps in the output tensor. |
[in] | workspace | Tensor to be used as the working space during the computation. |
[in] | activation | Activation to be used. |
Implemented in NEWinogradLayerTransformOutputKernel< T, OutputTileRows, OutputTileCols, KernelRows, KernelCols >.
get_matrix_stride () | pure virtual |
Gets the stride between matrices in the output workspace.
[in] | num_batches | Number of batches in the output tensor. |
[in] | num_rows | Number of rows in each feature map of the input tensor. |
[in] | num_cols | Number of columns in each feature map of the input tensor. |
[in] | num_output_channels | Number of feature maps in the output tensor. |
Implemented in NEWinogradLayerTransformOutputKernel< T, OutputTileRows, OutputTileCols, KernelRows, KernelCols >.
get_output_shape () | pure virtual |
Get the output shape of a convolution.
[in] | num_rows | Number of rows in each feature map of the input tensor. |
[in] | num_cols | Number of columns in each feature map of the input tensor. |
[in] | padding_same | True if padding is SAME, false otherwise. |
Implemented in NEWinogradLayerTransformOutputKernel< T, OutputTileRows, OutputTileCols, KernelRows, KernelCols >.
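As an illustration of the relationship between the input size, the padding mode and the output shape, the following is a minimal sketch and not the library's implementation. It assumes a stride-1 convolution; the function name sketch_output_shape and the kernel_rows/kernel_cols parameters are hypothetical and do not appear in the interface above.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper (not part of the library): output shape of a stride-1
// convolution with a kernel_rows x kernel_cols filter.
std::pair<unsigned int, unsigned int> sketch_output_shape(
    int num_rows, int num_cols, bool padding_same,
    int kernel_rows = 3, int kernel_cols = 3)
{
    // The input must be at least as large as the kernel.
    assert(num_rows >= kernel_rows && num_cols >= kernel_cols);

    // SAME padding keeps the spatial size; otherwise (VALID) each dimension
    // shrinks by (kernel size - 1).
    const int out_rows = padding_same ? num_rows : num_rows - (kernel_rows - 1);
    const int out_cols = padding_same ? num_cols : num_cols - (kernel_cols - 1);

    return { static_cast<unsigned int>(out_rows),
             static_cast<unsigned int>(out_cols) };
}
```

Under these assumptions, an 8x8 feature map with a 3x3 kernel keeps its size with SAME padding and yields a 6x6 output otherwise.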
get_output_storage_size () | pure virtual |
Determine how much memory (in units of TOut) to allocate for the (Winograd domain) output.
[in] | num_batches | Number of batches in the output tensor. |
[in] | num_rows | Number of rows in each feature map of the input tensor. |
[in] | num_cols | Number of columns in each feature map of the input tensor. |
[in] | num_output_channels | Number of feature maps in the output tensor. |
Implemented in NEWinogradLayerTransformOutputKernel< T, OutputTileRows, OutputTileCols, KernelRows, KernelCols >.
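To show how a matrix stride and a storage size of this kind can relate, the sketch below uses the textbook Winograd layout: one matrix per element of the inner tile, each holding (tiles across all batches) x (output channels) elements. This is an assumption for illustration only. The names WinogradSketch, sketch_matrix_stride and sketch_output_storage_size are hypothetical, num_rows and num_cols are taken to be the extents being tiled, and any alignment or rounding the real implementation applies is ignored.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of a Winograd configuration (not library API).
struct WinogradSketch
{
    int output_tile_rows;
    int output_tile_cols;
    int kernel_rows;
    int kernel_cols;
};

// Integer division rounding up.
inline int ceil_div(int a, int b)
{
    assert(b > 0);
    return (a + b - 1) / b;
}

// Elements in one Winograd-domain matrix:
// (number of tiles over all batches) x (number of output channels).
int sketch_matrix_stride(const WinogradSketch &w, int num_batches,
                         int num_rows, int num_cols, int num_output_channels)
{
    const int tile_rows = ceil_div(num_rows, w.output_tile_rows);
    const int tile_cols = ceil_div(num_cols, w.output_tile_cols);
    return num_batches * tile_rows * tile_cols * num_output_channels;
}

// Total elements: one matrix per position of the inner tile, whose size is
// (output tile + kernel - 1) in each dimension.
std::size_t sketch_output_storage_size(const WinogradSketch &w, int num_batches,
                                       int num_rows, int num_cols,
                                       int num_output_channels)
{
    const int inner_rows = w.output_tile_rows + w.kernel_rows - 1;
    const int inner_cols = w.output_tile_cols + w.kernel_cols - 1;
    const int stride = sketch_matrix_stride(w, num_batches, num_rows,
                                            num_cols, num_output_channels);
    return static_cast<std::size_t>(inner_rows * inner_cols) *
           static_cast<std::size_t>(stride);
}
```

For a 2x2 output tile with a 3x3 kernel the inner tile is 4x4, so this sketch counts 16 matrices; the value is in elements, matching the "units of TOut" wording above.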
get_working_space_size () | pure virtual |
Get the working space required to perform the transformation.
Note: the working space is required only while the transformation is running, so it can be reused whenever the transformation is not running.
[in] | num_threads | The greatest number of threads that will be used to execute the transform. |
Implemented in NEWinogradLayerTransformOutputKernel< T, OutputTileRows, OutputTileCols, KernelRows, KernelCols >.
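One way to picture a working space sized by the greatest number of threads is a single buffer divided into one slice per thread. The sketch below assumes exactly that layout, which the documentation above does not confirm; the ScratchPool class and its members are hypothetical and are not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scratch pool (not library API): one fixed-size slice per thread.
class ScratchPool
{
public:
    ScratchPool(std::size_t bytes_per_thread, unsigned int num_threads)
        : _bytes_per_thread(bytes_per_thread),
          _num_threads(num_threads),
          _storage(bytes_per_thread * num_threads)
    {
    }

    // Total size of the pool, sized for the greatest thread count.
    std::size_t size_bytes() const
    {
        return _storage.size();
    }

    // Start of the slice owned by the given thread.
    unsigned char *slice(unsigned int thread_id)
    {
        assert(thread_id < _num_threads);
        return _storage.data() + thread_id * _bytes_per_thread;
    }

private:
    std::size_t                _bytes_per_thread;
    unsigned int               _num_threads;
    std::vector<unsigned char> _storage;
};
```

Because each thread touches only its own slice, the same pool can be handed to another stage once the transformation has finished, which is the reuse the note above describes.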