Compute Library
 21.02
GCConvolutionLayer Class Reference

Basic function to compute the convolution layer. More...

#include <GCConvolutionLayer.h>


Public Member Functions

 GCConvolutionLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Default constructor. More...
 
 GCConvolutionLayer (const GCConvolutionLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 GCConvolutionLayer (GCConvolutionLayer &&)=default
 Default move constructor. More...
 
GCConvolutionLayer & operator= (const GCConvolutionLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
GCConvolutionLayer & operator= (GCConvolutionLayer &&)=default
 Default move assignment operator. More...
 
void configure (const IGCTensor *input, const IGCTensor *weights, const IGCTensor *biases, IGCTensor *output, const PadStrideInfo &conv_info, const WeightsInfo &weights_info=WeightsInfo(), const Size2D &dilation=Size2D(1U, 1U), const ActivationLayerInfo &act_info=ActivationLayerInfo(), unsigned int num_groups=1)
 Set the input and output tensors. More...
 
void run () override
 Run the kernels contained in the function. More...
 
void prepare () override
 Prepare the function for executing. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 

Detailed Description

Basic function to compute the convolution layer.

This function calls the following GLES kernels:

  1. GCWeightsReshapeKernel (executed only once for each configuration)
  2. GCGEMMTranspose1xWKernel (executed only once for each configuration)
  3. GCIm2ColKernel
  4. GCGEMMInterleave4x4Kernel
  5. GCCol2ImKernel
Deprecated:
This function is deprecated and is intended to be removed in the 21.05 release.
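The kernel sequence above implements the classic im2col + GEMM lowering of convolution: patches of the input are rearranged into rows of a matrix, after which the convolution reduces to a matrix multiplication with the reshaped weights. The following standalone sketch (plain C++, no Compute Library dependency; all function names are illustrative, and it covers only the single-channel, stride-1, unpadded case) shows the idea:

```cpp
#include <vector>

// Direct 2D convolution (valid padding, stride 1), single channel.
std::vector<float> conv_direct(const std::vector<float>& in, int iw, int ih,
                               const std::vector<float>& k, int kw, int kh)
{
    const int ow = iw - kw + 1, oh = ih - kh + 1;
    std::vector<float> out(ow * oh, 0.f);
    for(int oy = 0; oy < oh; ++oy)
        for(int ox = 0; ox < ow; ++ox)
            for(int ky = 0; ky < kh; ++ky)
                for(int kx = 0; kx < kw; ++kx)
                    out[oy * ow + ox] += in[(oy + ky) * iw + (ox + kx)] * k[ky * kw + kx];
    return out;
}

// im2col + GEMM: each output position becomes one row of patch values,
// so the convolution becomes a matrix-vector product with the flat kernel.
std::vector<float> conv_im2col_gemm(const std::vector<float>& in, int iw, int ih,
                                    const std::vector<float>& k, int kw, int kh)
{
    const int ow = iw - kw + 1, oh = ih - kh + 1;
    const int rows = ow * oh, cols = kw * kh;
    std::vector<float> col(rows * cols);
    for(int oy = 0; oy < oh; ++oy)         // im2col step
        for(int ox = 0; ox < ow; ++ox)
            for(int ky = 0; ky < kh; ++ky)
                for(int kx = 0; kx < kw; ++kx)
                    col[(oy * ow + ox) * cols + ky * kw + kx] = in[(oy + ky) * iw + (ox + kx)];
    std::vector<float> out(rows, 0.f);
    for(int r = 0; r < rows; ++r)          // GEMM step (matrix-vector here)
        for(int c = 0; c < cols; ++c)
            out[r] += col[r * cols + c] * k[c];
    return out;
}
```

Both routines produce identical results; the GEMM formulation is what lets the function reuse the highly tuned GCGEMM kernels listed above.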

Definition at line 82 of file GCConvolutionLayer.h.

Constructor & Destructor Documentation

◆ GCConvolutionLayer() [1/3]

GCConvolutionLayer ( std::shared_ptr< IMemoryManager > memory_manager = nullptr)

Default constructor.

Definition at line 68 of file GCConvolutionLayer.cpp.

 : _memory_group(std::move(memory_manager)), _reshape_weights(), _input_im2col_kernel(), _mm_gemm(), _output_col2im_kernel(), _fill_border(), _activationlayer_function(), _original_weights(nullptr),
   _input_im2col_reshaped(), _input_interleaved_reshaped(), _weights_reshaped(), _weights_transposed(), _gemm_output(), _tmp_output(), _is_activationlayer_enabled(false), _is_prepared(false)
{
}

◆ GCConvolutionLayer() [2/3]

GCConvolutionLayer ( const GCConvolutionLayer & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ GCConvolutionLayer() [3/3]

Default move constructor.

Member Function Documentation

◆ configure()

void configure ( const IGCTensor * input,
const IGCTensor * weights,
const IGCTensor * biases,
IGCTensor * output,
const PadStrideInfo & conv_info,
const WeightsInfo & weights_info = WeightsInfo(),
const Size2D & dilation = Size2D(1U, 1U),
const ActivationLayerInfo & act_info = ActivationLayerInfo(),
unsigned int num_groups = 1
)

Set the input and output tensors.

Parameters
 [in]  input         Source tensor. 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represent a batch of inputs. Data types supported: F16/F32.
 [in]  weights       Weights tensor. Weights are 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: Same as input.
 [in]  biases        Biases tensor. Shared biases supported. Biases are 1D tensor with dimensions [OFM]. Data type supported: Should match input data type, except for input of QASYMM8 type where biases should be of S32 type.
 [out] output        Destination tensor. 3 lower dimensions represent a single output [width, height, OFM], while the rest represent batch of outputs. Data types supported: Same as input.
 [in]  conv_info     Contains padding and stride information described in PadStrideInfo.
 [in]  weights_info  Specifies if the weights tensor has been reshaped with GCWeightsReshapeKernel. If this is not part of the fully connected layer the weights tensor has also been transposed with GCGEMMTranspose1xWKernel. Data type supported: Same as input.
 [in]  dilation      (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
 [in]  act_info      (Optional) Activation layer information in case of a fused activation.
 [in]  num_groups    (Optional) Number of groups when performing a grouped convolution. num_groups != 1 is not supported.
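The convolved output width and height used throughout configure() come from scaled_dimensions(). As a rough standalone sketch of that arithmetic (floor rounding; the helper name and the symmetric per-axis padding arguments here are illustrative, not the library's exact signature):

```cpp
#include <utility>

// Expected output width/height of a convolution, floor-rounded.
// Dilation stretches the effective kernel to d * (k - 1) + 1 elements.
std::pair<unsigned, unsigned> scaled_dims_sketch(unsigned w, unsigned h,
                                                 unsigned kw, unsigned kh,
                                                 unsigned stride_x, unsigned stride_y,
                                                 unsigned pad_x, unsigned pad_y,
                                                 unsigned dil_x = 1, unsigned dil_y = 1)
{
    const unsigned eff_kw = dil_x * (kw - 1) + 1; // effective (dilated) kernel width
    const unsigned eff_kh = dil_y * (kh - 1) + 1; // effective (dilated) kernel height
    const unsigned out_w  = (w + 2 * pad_x - eff_kw) / stride_x + 1;
    const unsigned out_h  = (h + 2 * pad_y - eff_kh) / stride_y + 1;
    return { out_w, out_h };
}
```

For example, a 224x224 input with a 3x3 kernel, stride 2 and padding 1 yields a 112x112 output.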

Definition at line 89 of file GCConvolutionLayer.cpp.

References ITensorAllocator::allocate(), GCTensor::allocator(), WeightsInfo::are_reshaped(), ARM_COMPUTE_ERROR_ON, ARM_COMPUTE_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_ERROR_ON_MSG, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_UNUSED, GCFillBorderKernel::configure(), GCConvolutionLayerReshapeWeights::configure(), GCActivationLayer::configure(), GCCol2ImKernel::configure(), GCIm2ColKernel::configure(), arm_compute::CONSTANT, arm_compute::test::validation::conv_info, ITensorInfo::data_type(), ITensorInfo::dimension(), dt, ActivationLayerInfo::enabled(), ITensorInfo::extend_padding(), arm_compute::F16, arm_compute::F32, GCScheduler::get(), ITensor::info(), GCTensor::info(), ITensorAllocator::init(), MemoryGroup::manage(), ITensorInfo::num_dimensions(), arm_compute::scaled_dimensions(), TensorShape::set(), IGCKernel::set_target(), PadStrideInfo::stride(), ITensorInfo::tensor_shape(), and TensorInfo::tensor_shape().

{
    ARM_COMPUTE_ERROR_ON_NULLPTR(input, weights);
    ARM_COMPUTE_ERROR_ON_MSG(weights_info.are_reshaped(), "Weights already reshaped are not supported!");
    ARM_COMPUTE_ERROR_ON(weights->info()->dimension(2) != input->info()->dimension(2));
    ARM_COMPUTE_ERROR_ON(weights->info()->num_dimensions() > 4);

    _is_prepared      = false;
    _original_weights = weights;

    if(biases != nullptr)
    {
        ARM_COMPUTE_ERROR_ON(biases->info()->dimension(0) != weights->info()->dimension(3));
        ARM_COMPUTE_ERROR_ON(biases->info()->num_dimensions() > 1);
    }

    const DataType dt = input->info()->data_type();

    // Set the GPU target for im2col and col2im
    _input_im2col_kernel.set_target(GCScheduler::get().get_target());
    _output_col2im_kernel.set_target(GCScheduler::get().get_target());

    const bool       append_bias   = (biases != nullptr);
    const unsigned   bias_element  = (append_bias) ? 1 : 0;
    const IGCTensor *biases_to_use = (append_bias) ? biases : nullptr;

    // Get parameters from conv_info
    unsigned int stride_x = 0;
    unsigned int stride_y = 0;
    std::tie(stride_x, stride_y) = conv_info.stride();

    // Get convolved dimensions
    unsigned int conv_w = 0;
    unsigned int conv_h = 0;

    const unsigned int kernel_width  = weights->info()->dimension(0);
    const unsigned int kernel_height = weights->info()->dimension(1);
    std::tie(conv_w, conv_h) = scaled_dimensions(input->info()->dimension(0), input->info()->dimension(1), kernel_width, kernel_height,
                                                 conv_info, dilation);

    unsigned int mat_weights_cols = weights->info()->dimension(3);
    unsigned int mat_weights_rows = weights->info()->dimension(0) * weights->info()->dimension(1) * weights->info()->dimension(2) + bias_element;

    // _weights_reshaped will be auto configured in the kernel.
    // Just append biases and do not transpose 1xW as it will be reshaped in GCGEMM
    _reshape_weights.configure(weights, biases_to_use, &_weights_reshaped);

    weights = &_weights_reshaped;

    // Create tensor to store im2col reshaped inputs
    const unsigned int mat_input_cols = mat_weights_rows;
    const unsigned int mat_input_rows = conv_w * conv_h;
    TensorShape        shape_im2col   = input->info()->tensor_shape();
    shape_im2col.set(0, mat_input_cols);
    shape_im2col.set(1, mat_input_rows);
    shape_im2col.set(2, 1);

    // FIXME: input->clone() doesn't work with subtensors for grouped convolutions.
    TensorInfo im2col_reshaped_info(shape_im2col, 1, dt);
    _input_im2col_reshaped.allocator()->init(im2col_reshaped_info);
    _memory_group.manage(&_input_im2col_reshaped);

    // Create GEMM output tensor
    TensorShape shape_gemm = _input_im2col_reshaped.info()->tensor_shape();
    shape_gemm.set(0, mat_weights_cols);
    shape_gemm.set(1, mat_input_rows);
    const DataType gemm_data_type = dt;

    // FIXME: input->clone() doesn't work with subtensors for grouped convolutions.
    TensorInfo info_gemm(shape_gemm, 1, gemm_data_type);
    _gemm_output.allocator()->init(info_gemm);
    _memory_group.manage(&_gemm_output);

    if(dt == DataType::F16)
    {
        BorderSize border_size = BorderSize(conv_info.pad_top(), conv_info.pad_right(), conv_info.pad_bottom(), conv_info.pad_left());
        input->info()->extend_padding(border_size);
        _fill_border.configure(input, border_size, BorderMode::CONSTANT, PixelValue()); // for PAD of im2col fp16: consider it as border
    }
    // Configure im2col
    _input_im2col_kernel.configure(input, &_input_im2col_reshaped, Size2D(kernel_width, kernel_height), conv_info, append_bias, dilation);

    // Configure GEMM
    configure_mm(&_input_im2col_reshaped, weights, &_gemm_output);

    _input_im2col_reshaped.allocator()->allocate();

    // Configure Col2Im
    _output_col2im_kernel.configure(&_gemm_output, output, std::make_pair(conv_w, conv_h));
    _gemm_output.allocator()->allocate();

    ARM_COMPUTE_ERROR_ON_MSG((output->info()->dimension(0) != conv_w) || (output->info()->dimension(1) != conv_h), "Output shape does not match the expected one");

    //Configure Activation Layer
    _is_activationlayer_enabled = act_info.enabled();

    if(_is_activationlayer_enabled)
    {
        _activationlayer_function.configure(output, nullptr, act_info);
    }

    ARM_COMPUTE_UNUSED(weights_info);
}
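The im2col/GEMM matrix shapes computed in configure() follow directly from the tensor dimensions: the weights collapse to a [mat_weights_rows x mat_weights_cols] matrix and each output position becomes one im2col row. A minimal sketch of the same arithmetic (struct and function names are illustrative):

```cpp
struct GemmShapes
{
    unsigned mat_weights_cols; // = OFM
    unsigned mat_weights_rows; // = kernel_x * kernel_y * IFM (+1 if bias is appended)
    unsigned mat_input_rows;   // = conv_w * conv_h
};

// Mirrors the shape computation in configure(): one flattened kernel patch
// per weights-matrix row, one output position per im2col row.
GemmShapes im2col_gemm_shapes(unsigned kernel_x, unsigned kernel_y,
                              unsigned ifm, unsigned ofm,
                              unsigned conv_w, unsigned conv_h,
                              bool append_bias)
{
    GemmShapes s;
    s.mat_weights_cols = ofm;
    s.mat_weights_rows = kernel_x * kernel_y * ifm + (append_bias ? 1u : 0u);
    s.mat_input_rows   = conv_w * conv_h;
    return s;
}
```

For a 3x3 kernel with IFM 16, OFM 32, a 112x112 convolved output and an appended bias, this gives a 145-row weights matrix multiplied against 12544 im2col rows.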

◆ operator=() [1/2]

GCConvolutionLayer& operator= ( const GCConvolutionLayer & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

GCConvolutionLayer& operator= ( GCConvolutionLayer &&  )
default

Default move assignment operator.

◆ prepare()

void prepare ( )
override virtual

Prepare the function for executing.

Any one off pre-processing step required by the function is handled here

Note
Prepare stage might not need all the function's buffers' backing memory to be available in order to execute

Reimplemented from IFunction.

Definition at line 225 of file GCConvolutionLayer.cpp.

References ITensorAllocator::allocate(), GCTensor::allocator(), ARM_COMPUTE_ERROR_ON, ITensor::is_used(), ITensor::mark_as_unused(), and GCConvolutionLayerReshapeWeights::run().

Referenced by GCConvolutionLayer::run().

{
    if(!_is_prepared)
    {
        ARM_COMPUTE_ERROR_ON(!_original_weights->is_used());

        // Run weights reshaping and mark as unused
        _weights_reshaped.allocator()->allocate();
        _reshape_weights.run();

        // Mark original weights tensor as unused
        _original_weights->mark_as_unused();

        _is_prepared = true;
    }
}
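prepare() follows a run-once pattern guarded by _is_prepared: the expensive weights reshaping happens on the first call only, and every later call is a no-op. A stripped-down sketch of the same idiom (class and member names are hypothetical, not the library's):

```cpp
// Run-once preparation: heavy work (standing in for allocate() plus
// _reshape_weights.run()) happens on the first call only.
class PreparedFunction
{
public:
    void prepare()
    {
        if(!_is_prepared)
        {
            ++_reshape_count; // the one-off weights reshaping would go here
            _is_prepared = true;
        }
    }
    void run()
    {
        prepare(); // safe to call every time: idempotent after the first run
        ++_run_count;
    }
    int reshape_count() const { return _reshape_count; }
    int run_count() const { return _run_count; }

private:
    bool _is_prepared   = false;
    int  _reshape_count = 0;
    int  _run_count     = 0;
};
```

Because run() calls prepare() itself, callers may either prepare explicitly ahead of time (e.g. at network load) or simply let the first run() pay the cost.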

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For Neon kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Will call prepare() on first run if it hasn't been done

Implements IFunction.

Definition at line 199 of file GCConvolutionLayer.cpp.

References GCScheduler::dispatch(), GCScheduler::get(), GCScheduler::memory_barrier(), GCConvolutionLayer::prepare(), IGCSimpleFunction::run(), and GCGEMM::run().

{
    prepare();

    MemoryGroupResourceScope scope_mg(_memory_group);

    // Run im2col
    GCScheduler::get().dispatch(_fill_border);
    GCScheduler::get().dispatch(_input_im2col_kernel);

    // Run gemm on reshaped matrices
    _mm_gemm.run();

    // Reshape output matrix
    GCScheduler::get().dispatch(_output_col2im_kernel, false);

    // Run Activation Layer
    if(_is_activationlayer_enabled)
    {
        _activationlayer_function.run();
    }
}
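run() keeps its intermediate buffers backed only for the duration of the call via MemoryGroupResourceScope, which is a plain RAII acquire/release guard over the memory group. A minimal sketch of that pattern (type and member names are illustrative, not the library's API):

```cpp
// Toy memory group: tracks whether its backing memory is currently acquired.
struct MemoryGroupSketch
{
    bool acquired = false;
    void acquire() { acquired = true; }
    void release() { acquired = false; }
};

// RAII scope guard: acquires on construction, releases on destruction, so
// intermediate tensors are only backed while the enclosing run() executes.
class ResourceScopeSketch
{
public:
    explicit ResourceScopeSketch(MemoryGroupSketch &group) : _group(group)
    {
        _group.acquire();
    }
    ~ResourceScopeSketch()
    {
        _group.release();
    }

private:
    MemoryGroupSketch &_group;
};
```

Declaring the scope as the first statement of run() means the memory is released automatically on every exit path, including early returns and exceptions.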

The documentation for this class was generated from the following files:

  GCConvolutionLayer.h
  GCConvolutionLayer.cpp