Compute Library
 22.11
CLWinogradConvolutionLayer Class Reference

Basic function to execute Winograd-based convolution on OpenCL.

#include <CLWinogradConvolutionLayer.h>

Public Member Functions

 CLWinogradConvolutionLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Default Constructor.
 
 ~CLWinogradConvolutionLayer ()
 Default Destructor.
 
 CLWinogradConvolutionLayer (const CLWinogradConvolutionLayer &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 
 CLWinogradConvolutionLayer (CLWinogradConvolutionLayer &&)=default
 Default move constructor.
 
CLWinogradConvolutionLayer & operator= (const CLWinogradConvolutionLayer &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 
CLWinogradConvolutionLayer & operator= (CLWinogradConvolutionLayer &&)=default
 Default move assignment operator.
 
void configure (ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, const PadStrideInfo &conv_info, const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false)
 Set the input and output tensors.
 
void configure (const CLCompileContext &compile_context, ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, const PadStrideInfo &conv_info, const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false)
 Set the input and output tensors.
 
void run () override
 Run the kernels contained in the function.
 
void prepare () override
 Prepare the function for executing.
 
Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor.
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, const PadStrideInfo &conv_info, const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false)
 Static function to check if given info will lead to a valid configuration of CLWinogradConvolutionLayer.
 

Detailed Description

Basic function to execute Winograd-based convolution on OpenCL.

This function calls the following OpenCL functions/kernels:

  1. opencl::ClWinogradConv2d

Definition at line 44 of file CLWinogradConvolutionLayer.h.
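
As background on what "Winograd-based" means, the following is a minimal sketch of the 1D Winograd F(2, 3) algorithm, which produces two outputs of a 3-tap filter with four multiplications instead of six. It is an illustration of the general technique only, not the Compute Library's OpenCL kernels; the function names `winograd_f2_3` and `direct_f2_3` are hypothetical.

```cpp
#include <array>

// 1D Winograd F(2, 3): two outputs of a 3-tap filter computed from four
// inputs with 4 multiplications (a direct computation needs 6).
// Illustrative sketch only; not the Compute Library implementation.
std::array<float, 2> winograd_f2_3(const std::array<float, 4> &d, const std::array<float, 3> &g)
{
    // Filter transform (can be computed once per filter and reused).
    const float u0 = g[0];
    const float u1 = 0.5f * (g[0] + g[1] + g[2]);
    const float u2 = 0.5f * (g[0] - g[1] + g[2]);
    const float u3 = g[2];

    // Input transform followed by the four element-wise products.
    const float m0 = (d[0] - d[2]) * u0;
    const float m1 = (d[1] + d[2]) * u1;
    const float m2 = (d[2] - d[1]) * u2;
    const float m3 = (d[1] - d[3]) * u3;

    // Output transform.
    return { m0 + m1 + m2, m1 - m2 - m3 };
}

// Direct reference computation of the same two outputs.
std::array<float, 2> direct_f2_3(const std::array<float, 4> &d, const std::array<float, 3> &g)
{
    return { d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
             d[1] * g[0] + d[2] * g[1] + d[3] * g[2] };
}
```

Because the filter transform depends only on the weights, it can be done once ahead of time, which is the kind of one-off work a prepare stage is suited to.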

Constructor & Destructor Documentation

◆ CLWinogradConvolutionLayer() [1/3]

CLWinogradConvolutionLayer ( std::shared_ptr< IMemoryManager > memory_manager = nullptr)

Default Constructor.

Definition at line 49 of file CLWinogradConvolutionLayer.cpp.

References CLWinogradConvolutionLayer::~CLWinogradConvolutionLayer().

CLWinogradConvolutionLayer::CLWinogradConvolutionLayer(std::shared_ptr<IMemoryManager> memory_manager)
    : _impl(std::make_unique<Impl>())
{
    _impl->memory_group = MemoryGroup(memory_manager);
}

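◆ ~CLWinogradConvolutionLayer()

~CLWinogradConvolutionLayer ( )

Default Destructor.

◆ CLWinogradConvolutionLayer() [2/3]

CLWinogradConvolutionLayer ( const CLWinogradConvolutionLayer & )
delete

Prevent instances of this class from being copied (as this class contains pointers).

◆ CLWinogradConvolutionLayer() [3/3]

CLWinogradConvolutionLayer ( CLWinogradConvolutionLayer && )
default

Default move constructor.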

Member Function Documentation

◆ configure() [1/2]

void configure ( ICLTensor * input,
const ICLTensor * weights,
const ICLTensor * biases,
ICLTensor * output,
const PadStrideInfo & conv_info,
const ActivationLayerInfo & act_info = ActivationLayerInfo(),
bool enable_fast_math = false
)

Set the input and output tensors.

Valid data layouts:

  • NHWC
  • NCHW

Valid data type configurations:

src0 (input)  src1 (weights)  src2 (biases)  dst (output)
F16           F16             F16            F16
F32           F32             F32            F32

Note
This function only works with 3x3, 3x1, 1x3, 5x5, 5x1, 1x5, 7x1 and 1x7 kernels, along with unit strides, for both NCHW and NHWC data layouts.
Some Winograd configurations (e.g. F(4x4, 5x5)) are supported only with enable_fast_math = true.
Parameters
[in] input: Source tensor. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: F16/F32.
[in] weights: Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: Same as input.
[in] biases: Biases tensor. Shared biases supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: Same as input.
[out] output: Destination tensor. The 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: Same as input.
[in] conv_info: Contains padding and stride information described in PadStrideInfo.
[in] act_info: (Optional) Activation layer information in case of a fused activation.
[in] enable_fast_math: (Optional) Enable fast math computation. If this flag is set, the function may dispatch the fastest implementation available, which may also reduce accuracy. Default is false.
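
The kernel-size and stride restriction in the note above can be expressed as a simple pre-check. The sketch below is a hypothetical helper (`is_listed_winograd_config` and `KernelSize` are not part of the library) that only mirrors the list in the note; the authoritative check remains validate(), which also considers data types, layouts and enable_fast_math.

```cpp
// Kernel sizes listed in the note above. The list is symmetric, so it does
// not matter which of the two numbers is the width.
// Hypothetical helper for illustration; not part of the Compute Library.
struct KernelSize
{
    int width;
    int height;
};

constexpr KernelSize kListedKernels[] = {
    { 3, 3 }, { 3, 1 }, { 1, 3 }, { 5, 5 }, { 5, 1 }, { 1, 5 }, { 7, 1 }, { 1, 7 }
};

// Returns true if the kernel size appears in the note and both strides are 1.
bool is_listed_winograd_config(int kernel_w, int kernel_h, int stride_x, int stride_y)
{
    if(stride_x != 1 || stride_y != 1)
    {
        return false; // only unit strides are listed as supported
    }
    for(const KernelSize &k : kListedKernels)
    {
        if(k.width == kernel_w && k.height == kernel_h)
        {
            return true;
        }
    }
    return false;
}
```

A caller could use such a check to decide early whether to attempt the Winograd path at all, before building tensor infos for validate().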

Definition at line 57 of file CLWinogradConvolutionLayer.cpp.

References CLKernelLibrary::get().

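void CLWinogradConvolutionLayer::configure(ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, const PadStrideInfo &conv_info,
                                           const ActivationLayerInfo &act_info, bool enable_fast_math)
{
    configure(CLKernelLibrary::get().get_compile_context(), input, weights, biases, output, conv_info, act_info, enable_fast_math);
}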

◆ configure() [2/2]

void configure ( const CLCompileContext & compile_context,
ICLTensor * input,
const ICLTensor * weights,
const ICLTensor * biases,
ICLTensor * output,
const PadStrideInfo & conv_info,
const ActivationLayerInfo & act_info = ActivationLayerInfo(),
bool enable_fast_math = false
)

Set the input and output tensors.

Note
This function only works with 3x3, 3x1, 1x3, 5x5, 5x1, 1x5, 7x1 and 1x7 kernels, along with unit strides, for both NCHW and NHWC data layouts.
Some Winograd configurations (e.g. F(4x4, 5x5)) are supported only with enable_fast_math = true.
Parameters
[in] compile_context: The compile context to be used.
[in] input: Source tensor. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: F16/F32.
[in] weights: Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: Same as input.
[in] biases: Biases tensor. Shared biases supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: Same as input.
[out] output: Destination tensor. The 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: Same as input.
[in] conv_info: Contains padding and stride information described in PadStrideInfo.
[in] act_info: (Optional) Activation layer information in case of a fused activation.
[in] enable_fast_math: (Optional) Enable fast math computation. If this flag is set, the function may dispatch the fastest implementation available, which may also reduce accuracy. Default is false.
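
The parameter descriptions above relate the input, weights and output shapes. As a rough sketch of that relationship for the unit-stride case this function supports, the helper below applies standard convolution arithmetic; `Shape3D` and `unit_stride_output_shape` are hypothetical names, and the formula is general convolution arithmetic rather than code taken from the library.

```cpp
#include <cassert>

// [width, height, channels] of a single input or output, matching the
// dimension order used in the parameter descriptions above.
struct Shape3D
{
    int width;
    int height;
    int channels;
};

// Output shape of a unit-stride convolution with explicit padding.
// Hypothetical helper for illustration; not part of the Compute Library.
Shape3D unit_stride_output_shape(const Shape3D &input, int kernel_w, int kernel_h, int ofm,
                                 int pad_left, int pad_right, int pad_top, int pad_bottom)
{
    const int out_w = input.width + pad_left + pad_right - kernel_w + 1;
    const int out_h = input.height + pad_top + pad_bottom - kernel_h + 1;
    assert(out_w > 0 && out_h > 0); // kernel must fit inside the padded input
    return { out_w, out_h, ofm };   // output channels equal the OFM dimension of the weights
}
```

For example, a 224x224x3 input with a 3x3 kernel, 64 output feature maps and a padding of 1 on every side keeps its spatial size and yields a 224x224x64 output.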

Definition at line 63 of file CLWinogradConvolutionLayer.cpp.

References arm_compute::ACL_DST, arm_compute::ACL_SRC_0, arm_compute::ACL_SRC_1, arm_compute::ACL_SRC_2, arm_compute::test::validation::conv_info, ITensor::info(), and arm_compute::test::validation::input.

void CLWinogradConvolutionLayer::configure(const CLCompileContext &compile_context, ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output,
                                           const PadStrideInfo &conv_info, const ActivationLayerInfo &act_info, bool enable_fast_math)
{
    _impl->src     = input;
    _impl->weights = weights;
    _impl->biases  = biases;
    _impl->dst     = output;

    _impl->op = std::make_unique<opencl::ClWinogradConv2d>();
    _impl->op->configure(compile_context, input->info(), weights->info(), (biases != nullptr ? biases->info() : nullptr), output->info(), conv_info, act_info, enable_fast_math);

    _impl->run_pack =
    {
        { TensorType::ACL_SRC_0, _impl->src },
        { TensorType::ACL_SRC_1, _impl->weights },
        { TensorType::ACL_SRC_2, _impl->biases },
        { TensorType::ACL_DST, _impl->dst }
    };
    _impl->workspace_tensors = manage_workspace<CLTensor>(_impl->op->workspace(), _impl->memory_group, _impl->run_pack, _impl->run_pack);
}

◆ operator=() [1/2]

CLWinogradConvolutionLayer& operator= ( const CLWinogradConvolutionLayer & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

CLWinogradConvolutionLayer& operator= ( CLWinogradConvolutionLayer && )
default

Default move assignment operator.

◆ prepare()

void prepare ( )
override virtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note
The prepare stage might not need all of the function's buffers' backing memory to be available in order to execute.

Reimplemented from IFunction.

Definition at line 98 of file CLWinogradConvolutionLayer.cpp.

References arm_compute::ACL_SRC_1, and arm_compute::release_prepare_tensors().

Referenced by CLWinogradConvolutionLayer::run().

void CLWinogradConvolutionLayer::prepare()
{
    if(!_impl->is_prepared)
    {
        _impl->op->prepare(_impl->run_pack);

        // Release Preparation tensors
        release_prepare_tensors(_impl->workspace_tensors, _impl->run_pack);
        _impl->run_pack.remove_tensor(TensorType::ACL_SRC_1);
        _impl->is_prepared = true;
    }
}
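
The listing above guards the one-off work with an is_prepared flag, so repeated calls are cheap. A minimal standalone sketch of that pattern follows; `LazyPreparedFunction` and its counters are hypothetical stand-ins used only to make the behaviour observable, not library types.

```cpp
// Prepare-once pattern: the expensive step runs on the first call only,
// and run() triggers it automatically if the caller has not done so.
// Hypothetical stand-in for illustration; not a Compute Library class.
struct LazyPreparedFunction
{
    int  prepare_calls = 0; // number of times the one-off work actually ran
    int  run_calls     = 0; // number of completed run() calls
    bool is_prepared   = false;

    void prepare()
    {
        if(!is_prepared)
        {
            ++prepare_calls; // stand-in for the one-off pre-processing step
            is_prepared = true;
        }
    }

    void run()
    {
        prepare(); // no-op after the first call
        ++run_calls;
    }
};
```

Calling run() several times, or calling prepare() explicitly before or after, leaves `prepare_calls` at 1.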

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For CPU kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads.

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function does not block until the kernels have executed. It is the user's responsibility to wait.
prepare() is called on the first run if it has not been called already.

Implements IFunction.

Definition at line 91 of file CLWinogradConvolutionLayer.cpp.

References CLWinogradConvolutionLayer::prepare().

void CLWinogradConvolutionLayer::run()
{
    MemoryGroupResourceScope scope_mg(_impl->memory_group);
    prepare();
    _impl->op->run(_impl->run_pack);
}

◆ validate()

Status validate ( const ITensorInfo * input,
const ITensorInfo * weights,
const ITensorInfo * biases,
const ITensorInfo * output,
const PadStrideInfo & conv_info,
const ActivationLayerInfo & act_info = ActivationLayerInfo(),
bool enable_fast_math = false
)
static

Static function to check if given info will lead to a valid configuration of CLWinogradConvolutionLayer.

Note
This function only works with 3x3, 3x1, 1x3, 5x5, 5x1 and 1x5 kernels, along with unit strides, for both NCHW and NHWC data layouts.
Some Winograd configurations (e.g. F(4x4, 5x5)) are supported only with enable_fast_math = true.
Parameters
[in] input: Source tensor. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: F16/F32.
[in] weights: Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: Same as input.
[in] biases: Biases tensor. Shared biases supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: Same as input.
[out] output: Destination tensor. The 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: Same as input.
[in] conv_info: Contains padding and stride information described in PadStrideInfo.
[in] act_info: (Optional) Activation layer information in case of a fused activation.
[in] enable_fast_math: (Optional) Enable fast math computation. If this flag is set, the function may dispatch the fastest implementation available, which may also reduce accuracy. Default is false.
Returns
A status.

Definition at line 85 of file CLWinogradConvolutionLayer.cpp.

References ClWinogradConv2d::validate().

Referenced by arm_compute::test::validation::DATA_TEST_CASE().

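Status CLWinogradConvolutionLayer::validate(const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, const PadStrideInfo &conv_info,
                                            const ActivationLayerInfo &act_info, bool enable_fast_math)
{
    return opencl::ClWinogradConv2d::validate(input, weights, biases, output, conv_info, act_info, enable_fast_math);
}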

The documentation for this class was generated from the following files: