Compute Library
 21.11
ICpuWinogradConv2dTransformOutputKernel Class Reference (abstract)

Interface for the kernel to perform Winograd output transform.

#include <CpuWinogradConv2dKernel.h>

Public Member Functions

virtual unsigned int get_working_space_size (unsigned int num_threads) const =0
 Get the working space required to perform the transformation.
 
virtual unsigned int get_output_storage_size (int num_batches, int num_rows, int num_cols, int num_output_channels) const =0
 Determine how much memory (in units of TOut) to allocate for the (Winograd domain) output.
 
virtual int get_matrix_stride (int num_batches, int num_rows, int num_cols, int num_output_channels) const =0
 Gets the stride between matrices in the output workspace.
 
virtual std::pair< unsigned int, unsigned int > get_output_shape (int num_rows, int num_cols, bool padding_same) const =0
 Get the output shape of a convolution.
 
virtual void configure (const ITensorInfo *biases, const ITensorInfo *transformed_output, const int matrix_stride, ITensorInfo *output_nhwc, const int num_batches, const int num_rows, const int num_cols, const int num_channels, ITensorInfo *workspace, const arm_gemm::Activation &activation)=0
 Configure the output transform kernel.
 
virtual ~ICpuWinogradConv2dTransformOutputKernel ()
 
- Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor.
 
virtual void run (const Window &window, const ThreadInfo &info)
 Execute the kernel on the passed window.
 
virtual void run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator)
 Legacy compatibility layer for implementations that do not support thread_locator. In these cases the interface is simply narrowed down to the legacy version.
 
virtual void run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info)
 Execute the kernel on the passed window.
 
virtual size_t get_mws (const CPUInfo &platform, size_t thread_count) const
 Return minimum workload size of the relevant kernel.
 
virtual const char * name () const =0
 Name of the kernel.
 
- Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.
 
virtual ~IKernel ()=default
 Destructor.
 
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.
 
virtual BorderSize border_size () const
 The size of the border for that kernel.
 
const Window & window () const
 The maximum window the kernel can be executed on.
 
bool is_window_configured () const
 Function to check if the embedded window of this kernel has been configured.
 

Additional Inherited Members

- Static Public Attributes inherited from ICPPKernel
static constexpr size_t default_mws = 128
 
static constexpr size_t small_network_mws = 256
 

Detailed Description

Interface for the kernel to perform Winograd output transform.

Definition at line 219 of file CpuWinogradConv2dKernel.h.

Constructor & Destructor Documentation

◆ ~ICpuWinogradConv2dTransformOutputKernel()

virtual ~ICpuWinogradConv2dTransformOutputKernel ( )
inline virtual

Definition at line 296 of file CpuWinogradConv2dKernel.h.

297  {
298  }

Member Function Documentation

◆ configure()

virtual void configure ( const ITensorInfo * biases,
const ITensorInfo * transformed_output,
const int  matrix_stride,
ITensorInfo * output_nhwc,
const int  num_batches,
const int  num_rows,
const int  num_cols,
const int  num_channels,
ITensorInfo * workspace,
const arm_gemm::Activation & activation 
)
pure virtual

Configure the output transform kernel.

Parameters
[in]   biases              Pointer to the biases tensor.
[in]   transformed_output  Pointer to working space for the output tensor in the Winograd domain.
[in]   matrix_stride       Output matrix stride; can be computed with winograd::WinogradGEMM<2, 2, 3, 3>::Convolution<float, float>::get_output_matrix_stride().
[out]  output_nhwc         Pointer to the output tensor, in NHWC data layout and in the spatial domain.
[in]   num_batches         Number of batches in the input tensor.
[in]   num_rows            Number of rows in the output tensor.
[in]   num_cols            Number of columns in the output tensor.
[in]   num_channels        Number of feature maps in the output tensor.
[in]   workspace           Tensor to be used as the working space during the computation.
[in]   activation          Activation to be used.

Implemented in CpuWinogradConv2dTransformOutputKernel< T, OutputTileRows, OutputTileCols, KernelRows, KernelCols >.

◆ get_matrix_stride()

virtual int get_matrix_stride ( int  num_batches,
int  num_rows,
int  num_cols,
int  num_output_channels 
) const
pure virtual

Gets the stride between matrices in the output workspace.

Parameters
[in]   num_batches          Number of batches in the output tensor.
[in]   num_rows             Number of rows in each feature map of the input tensor.
[in]   num_cols             Number of columns in each feature map of the input tensor.
[in]   num_output_channels  Number of feature maps in the output tensor.
Returns
Stride expressed in bytes.

Implemented in CpuWinogradConv2dTransformOutputKernel< T, OutputTileRows, OutputTileCols, KernelRows, KernelCols >.

◆ get_output_shape()

virtual std::pair<unsigned int, unsigned int> get_output_shape ( int  num_rows,
int  num_cols,
bool  padding_same 
) const
pure virtual

Get the output shape of a convolution.

Parameters
[in]   num_rows      Number of rows in each feature map of the input tensor.
[in]   num_cols      Number of columns in each feature map of the input tensor.
[in]   padding_same  True if padding is SAME, false otherwise.
Returns
Shape of the output tensor.

Implemented in CpuWinogradConv2dTransformOutputKernel< T, OutputTileRows, OutputTileCols, KernelRows, KernelCols >.
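
As a rough illustration of what get_output_shape() is expected to return, the sketch below computes the output shape of a stride-1 convolution. The function name sketch_output_shape and the kernel_rows / kernel_cols parameters are hypothetical and not part of Compute Library; that the library's implementation uses exactly this arithmetic is an assumption, not something this page states.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper (not a Compute Library function): output shape of a
// stride-1 convolution. With SAME padding the spatial size is preserved;
// otherwise (VALID) each dimension shrinks by (kernel size - 1).
std::pair<unsigned int, unsigned int> sketch_output_shape(int num_rows, int num_cols, bool padding_same,
                                                          int kernel_rows = 3, int kernel_cols = 3)
{
    // The VALID case is only defined when the kernel fits inside the input.
    assert(num_rows >= kernel_rows && num_cols >= kernel_cols);

    const int out_rows = padding_same ? num_rows : num_rows - kernel_rows + 1;
    const int out_cols = padding_same ? num_cols : num_cols - kernel_cols + 1;
    return std::make_pair(static_cast<unsigned int>(out_rows), static_cast<unsigned int>(out_cols));
}
```

For an 8x8 input and a 3x3 kernel, this sketch yields 8x8 with SAME padding and 6x6 without padding.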

◆ get_output_storage_size()

virtual unsigned int get_output_storage_size ( int  num_batches,
int  num_rows,
int  num_cols,
int  num_output_channels 
) const
pure virtual

Determine how much memory (in units of TOut) to allocate for the (Winograd domain) output.

Parameters
[in]   num_batches          Number of batches in the output tensor.
[in]   num_rows             Number of rows in each feature map of the input tensor.
[in]   num_cols             Number of columns in each feature map of the input tensor.
[in]   num_output_channels  Number of feature maps in the output tensor.
Returns
Storage size (in units of TOut) required.

Implemented in CpuWinogradConv2dTransformOutputKernel< T, OutputTileRows, OutputTileCols, KernelRows, KernelCols >.
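
To give a feel for how the matrix stride and the Winograd-domain storage size relate, here is a minimal sketch based on the textbook Winograd layout: one matrix per element of the transformed tile, each of size (batches x tiles) by output channels. Every name below is hypothetical. The sketch counts elements rather than bytes and assumes no rounding or padding of the matrix dimensions, which the library's implementation may well apply.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: integer division rounding up.
constexpr int ceil_div(int a, int b) { return (a + b - 1) / b; }

// Hypothetical result type: both sizes are counted in elements, not bytes.
struct SketchOutputSizes
{
    std::size_t matrix_stride; // elements between consecutive Winograd-domain matrices
    std::size_t storage_size;  // total elements for all Winograd-domain matrices
};

// Textbook sizing for an F(tile_rows x tile_cols, kernel_rows x kernel_cols) output
// workspace. rows/cols describe the feature map that is split into output tiles.
SketchOutputSizes sketch_output_sizes(int num_batches, int rows, int cols, int num_output_channels,
                                      int tile_rows = 2, int tile_cols = 2,
                                      int kernel_rows = 3, int kernel_cols = 3)
{
    assert(num_batches > 0 && rows > 0 && cols > 0 && num_output_channels > 0);

    // Number of output tiles per image (partial tiles count as whole tiles).
    const int tiles = ceil_div(rows, tile_rows) * ceil_div(cols, tile_cols);

    // Each matrix has one row per (batch, tile) pair and one column per channel.
    const std::size_t matrix_rows = static_cast<std::size_t>(num_batches) * static_cast<std::size_t>(tiles);
    const std::size_t stride      = matrix_rows * static_cast<std::size_t>(num_output_channels);

    // One matrix per element of the transformed tile: (m + r - 1) x (n + s - 1).
    const std::size_t num_matrices =
        static_cast<std::size_t>(tile_rows + kernel_rows - 1) * static_cast<std::size_t>(tile_cols + kernel_cols - 1);

    return SketchOutputSizes{ stride, num_matrices * stride };
}
```

Under these assumptions, one batch of 8x8 feature maps with 16 output channels and F(2x2, 3x3) gives 16 tiles, a stride of 256 elements, and 16 matrices, so 4096 elements in total. Multiplying by the element size would convert these figures to bytes.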

◆ get_working_space_size()

virtual unsigned int get_working_space_size ( unsigned int  num_threads) const
pure virtual

Get the working space required to perform the transformation.

Note that the working space is only required while the transformation is running, so it can be reused whenever the transformation is not running.

Parameters
[in]   num_threads  The greatest number of threads that will be used to execute the transform.
Returns
Size of working space required in bytes.

Implemented in CpuWinogradConv2dTransformOutputKernel< T, OutputTileRows, OutputTileCols, KernelRows, KernelCols >.
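
The sketch below shows one way a caller might size a scratch buffer from get_working_space_size(). It is self-contained and does not include the Compute Library headers: IOutputTransformLike is a hypothetical stand-in that mirrors only this one documented method, and ToyOutputTransform returns an invented figure of 1024 bytes per thread purely for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the real interface; it mirrors only the
// documented signature of get_working_space_size().
struct IOutputTransformLike
{
    virtual ~IOutputTransformLike() = default;
    virtual unsigned int get_working_space_size(unsigned int num_threads) const = 0;
};

// Toy implementation: the 1024 bytes per thread is an invented number,
// not a value taken from Compute Library.
struct ToyOutputTransform final : IOutputTransformLike
{
    unsigned int get_working_space_size(unsigned int num_threads) const override
    {
        return 1024u * num_threads;
    }
};

// Allocate one byte buffer sized for the greatest thread count that will
// ever execute the transform, as the parameter description requires.
std::vector<std::uint8_t> allocate_working_space(const IOutputTransformLike &kernel, unsigned int max_threads)
{
    assert(max_threads > 0);
    return std::vector<std::uint8_t>(kernel.get_working_space_size(max_threads));
}
```

Because the working space is only needed while the transform runs, a buffer allocated this way could be shared with other stages that never execute at the same time.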


The documentation for this class was generated from the following file: