Compute Library
 19.08
GCConvolutionLayer Class Reference

Basic function to compute the convolution layer. More...

#include <GCConvolutionLayer.h>

Collaboration diagram for GCConvolutionLayer:

Public Member Functions

 GCConvolutionLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Default constructor. More...
 
 GCConvolutionLayer (const GCConvolutionLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 GCConvolutionLayer (GCConvolutionLayer &&)=default
 Default move constructor. More...
 
GCConvolutionLayer & operator= (const GCConvolutionLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
GCConvolutionLayer & operator= (GCConvolutionLayer &&)=default
 Default move assignment operator. More...
 
void configure (const IGCTensor *input, const IGCTensor *weights, const IGCTensor *biases, IGCTensor *output, const PadStrideInfo &conv_info, const WeightsInfo &weights_info=WeightsInfo(), const Size2D &dilation=Size2D(1U, 1U), const ActivationLayerInfo &act_info=ActivationLayerInfo(), unsigned int num_groups=1)
 Set the input and output tensors. More...
 
void run () override
 Run the kernels contained in the function. More...
 
void prepare () override
 Prepare the function for executing. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 

Detailed Description

Basic function to compute the convolution layer.

This function calls the following GLES kernels:

  1. GCWeightsReshapeKernel (executed only once for each configuration)
  2. GCGEMMTranspose1xWKernel (executed only once for each configuration)
  3. GCIm2ColKernel
  4. GCGEMMInterleave4x4Kernel
  5. GCCol2ImKernel

Definition at line 76 of file GCConvolutionLayer.h.
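The kernel pipeline above lowers convolution to a single matrix multiply: im2col flattens each input patch into a row, the reshaped weights form the other operand, and col2im folds the product back into a tensor. The matrix shapes involved can be sketched in plain C++ (an illustrative helper under assumed names — `GemmShapes` and `gemm_shapes` are not part of the library):

```cpp
#include <cassert>
#include <cstddef>

// Matrix shapes produced by the im2col -> GEMM -> col2im lowering
// (mirrors the shape logic in GCConvolutionLayer::configure()).
struct GemmShapes
{
    size_t im2col_rows;   // one row per output spatial position: conv_w * conv_h
    size_t im2col_cols;   // one column per kernel element: k_w * k_h * IFM (+1 if bias is appended)
    size_t gemm_out_cols; // one column per output feature map: OFM
};

// Illustrative helper, not a library function.
GemmShapes gemm_shapes(size_t conv_w, size_t conv_h,
                       size_t kernel_w, size_t kernel_h,
                       size_t ifm, size_t ofm, bool append_bias)
{
    GemmShapes s{};
    s.im2col_rows   = conv_w * conv_h;
    s.im2col_cols   = kernel_w * kernel_h * ifm + (append_bias ? 1u : 0u);
    s.gemm_out_cols = ofm;
    return s;
}
```

For a 3x3 kernel over 16 input feature maps with a bias, each im2col row holds 3 * 3 * 16 + 1 = 145 values, one per weight the GEMM consumes.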

Constructor & Destructor Documentation

◆ GCConvolutionLayer() [1/3]

GCConvolutionLayer ( std::shared_ptr< IMemoryManager >  memory_manager = nullptr )

Default constructor.

Definition at line 69 of file GCConvolutionLayer.cpp.

70  : _memory_group(std::move(memory_manager)), _reshape_weights(), _input_im2col_kernel(), _mm_gemm(), _output_col2im_kernel(), _fill_border(), _activationlayer_function(), _original_weights(nullptr),
71  _input_im2col_reshaped(), _input_interleaved_reshaped(), _weights_reshaped(), _weights_transposed(), _gemm_output(), _tmp_output(), _is_activationlayer_enabled(false), _is_prepared(false)
72 {
73 }

◆ GCConvolutionLayer() [2/3]

GCConvolutionLayer ( const GCConvolutionLayer & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ GCConvolutionLayer() [3/3]

GCConvolutionLayer ( GCConvolutionLayer && )
default

Default move constructor.

Member Function Documentation

◆ configure()

void configure ( const IGCTensor * input,
const IGCTensor * weights,
const IGCTensor * biases,
IGCTensor * output,
const PadStrideInfo & conv_info,
const WeightsInfo & weights_info = WeightsInfo(),
const Size2D & dilation = Size2D(1U, 1U),
const ActivationLayerInfo & act_info = ActivationLayerInfo(),
unsigned int  num_groups = 1 
)

Set the input and output tensors.

Parameters
  [in]   input         Source tensor. 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represent a batch of inputs. Data types supported: F16/F32.
  [in]   weights       Weights tensor. Weights are 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: Same as input.
  [in]   biases        Biases tensor. Shared biases supported. Biases are 1D tensor with dimensions [OFM]. Data type supported: Should match input data type, except for input of QASYMM8 type where biases should be of S32 type.
  [out]  output        Destination tensor. 3 lower dimensions represent a single output [width, height, OFM], while the rest represent batch of outputs. Data types supported: Same as input.
  [in]   conv_info     Contains padding and stride information described in PadStrideInfo.
  [in]   weights_info  Specifies if the weights tensor has been reshaped with GCWeightsReshapeKernel. If this is not part of the fully connected layer the weights tensor has also been transposed with GCGEMMTranspose1xWKernel. Data type supported: Same as input.
  [in]   dilation      (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
  [in]   act_info      (Optional) Activation layer information in case of a fused activation.
  [in]   num_groups    (Optional) Number of groups when performing a grouped convolution. num_groups != 1 is not supported.
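The output spatial dimensions that configure() validates against are derived from conv_info, the kernel size, and the dilation. A plain-C++ sketch of that computation follows; it assumes floor rounding (the common default) and the helper name `convolved_size` is illustrative — the library's own computation lives in arm_compute::scaled_dimensions():

```cpp
#include <cassert>

// Output size of one spatial dimension, following the usual
// floor-rounded convolution formula:
//   out = (in + pad_begin + pad_end - ((kernel - 1) * dilation + 1)) / stride + 1
// Illustrative sketch, not the library implementation.
unsigned int convolved_size(unsigned int in, unsigned int kernel,
                            unsigned int stride, unsigned int pad_begin,
                            unsigned int pad_end, unsigned int dilation)
{
    const unsigned int effective_kernel = (kernel - 1) * dilation + 1;
    return (in + pad_begin + pad_end - effective_kernel) / stride + 1;
}
```

For example, a 224-wide input with a 3x3 kernel, stride 1, and padding 1 keeps its width at 224, which is why "same" padding is stated that way in most frameworks.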

Definition at line 90 of file GCConvolutionLayer.cpp.

92 {
96  ARM_COMPUTE_ERROR_ON_MSG(weights_info.are_reshaped(), "Weights already reshaped are not supported!");
97  ARM_COMPUTE_ERROR_ON(weights->info()->dimension(2) != input->info()->dimension(2));
101 
102  _is_prepared = false;
103  _original_weights = weights;
104 
105  if(biases != nullptr)
106  {
108  ARM_COMPUTE_ERROR_ON(biases->info()->dimension(0) != weights->info()->dimension(3));
109  ARM_COMPUTE_ERROR_ON(biases->info()->num_dimensions() > 1);
110  }
111 
112  const DataType dt = input->info()->data_type();
113 
114  // Set the GPU target for im2col and col2im
115  _input_im2col_kernel.set_target(GCScheduler::get().get_target());
116  _output_col2im_kernel.set_target(GCScheduler::get().get_target());
117 
118  const bool append_bias = (biases != nullptr);
119  const unsigned bias_element = (append_bias) ? 1 : 0;
120  const IGCTensor *biases_to_use = (append_bias) ? biases : nullptr;
121 
122  // Get parameters from conv_info
123  unsigned int stride_x = 0;
124  unsigned int stride_y = 0;
125  std::tie(stride_x, stride_y) = conv_info.stride();
126 
127  // Get convolved dimensions
128  unsigned int conv_w = 0;
129  unsigned int conv_h = 0;
130 
131  const unsigned int kernel_width = weights->info()->dimension(0);
132  const unsigned int kernel_height = weights->info()->dimension(1);
133  std::tie(conv_w, conv_h) = scaled_dimensions(input->info()->dimension(0), input->info()->dimension(1), kernel_width, kernel_height,
134                                               conv_info, dilation);
135 
136  unsigned int mat_weights_cols = weights->info()->dimension(3);
137  unsigned int mat_weights_rows = weights->info()->dimension(0) * weights->info()->dimension(1) * weights->info()->dimension(2) + bias_element;
138 
139  // _weights_reshaped will be auto configured in the kernel.
140  // Just append biases and do not transpose 1xW as it will be reshaped in GCGEMM
141  _reshape_weights.configure(weights, biases_to_use, &_weights_reshaped);
142 
143  weights = &_weights_reshaped;
144 
145  // Create tensor to store im2col reshaped inputs
146  const unsigned int mat_input_cols = mat_weights_rows;
147  const unsigned int mat_input_rows = conv_w * conv_h;
148  TensorShape shape_im2col = input->info()->tensor_shape();
149  shape_im2col.set(0, mat_input_cols);
150  shape_im2col.set(1, mat_input_rows);
151  shape_im2col.set(2, 1);
152 
153  // FIXME: input->clone() doesn't work with subtensors for grouped convolutions.
154  TensorInfo im2col_reshaped_info(shape_im2col, 1, dt);
155  _input_im2col_reshaped.allocator()->init(im2col_reshaped_info);
156  _memory_group.manage(&_input_im2col_reshaped);
157 
158  // Create GEMM output tensor
159  TensorShape shape_gemm = _input_im2col_reshaped.info()->tensor_shape();
160  shape_gemm.set(0, mat_weights_cols);
161  shape_gemm.set(1, mat_input_rows);
162  const DataType gemm_data_type = dt;
163 
164  // FIXME: input->clone() doesn't work with subtensors for grouped convolutions.
165  TensorInfo info_gemm(shape_gemm, 1, gemm_data_type);
166  _gemm_output.allocator()->init(info_gemm);
167  _memory_group.manage(&_gemm_output);
168 
169  if(dt == DataType::F16)
170  {
171  BorderSize border_size = BorderSize(conv_info.pad_top(), conv_info.pad_right(), conv_info.pad_bottom(), conv_info.pad_left());
172  input->info()->extend_padding(border_size);
173  _fill_border.configure(input, border_size, BorderMode::CONSTANT, PixelValue()); // for PAD of im2col fp16: consider it as border
174  }
175  // Configure im2col
176  _input_im2col_kernel.configure(input, &_input_im2col_reshaped, Size2D(kernel_width, kernel_height), conv_info, append_bias, dilation);
177 
178  // Configure GEMM
179  configure_mm(&_input_im2col_reshaped, weights, &_gemm_output);
180 
181  _input_im2col_reshaped.allocator()->allocate();
182 
183  // Configure Col2Im
184  _output_col2im_kernel.configure(&_gemm_output, output, std::make_pair(conv_w, conv_h));
185  _gemm_output.allocator()->allocate();
186 
187  ARM_COMPUTE_ERROR_ON_MSG((output->info()->dimension(0) != conv_w) || (output->info()->dimension(1) != conv_h), "Output shape does not match the expected one");
188 
189  //Configure Activation Layer
190  _is_activationlayer_enabled = act_info.enabled();
191 
192  if(_is_activationlayer_enabled)
193  {
194  _activationlayer_function.configure(output, nullptr, act_info);
195  }
196 
198 }
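The final GCCol2ImKernel step configured above rearranges the GEMM output matrix (one row per spatial position, one column per output feature map) into the [width, height, OFM] destination tensor. A plain-C++ sketch of that index mapping, for illustration only (the `col2im` helper below is not the library implementation, which runs on the GPU):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Rearranges a (conv_w * conv_h) x OFM GEMM output matrix into a
// [width, height, OFM] tensor, as GCCol2ImKernel does on the GPU.
// Illustrative CPU sketch; assumes positions are stored row-major (x + y * conv_w).
std::vector<float> col2im(const std::vector<float> &gemm_out,
                          size_t conv_w, size_t conv_h, size_t ofm)
{
    std::vector<float> out(conv_w * conv_h * ofm);
    for(size_t pos = 0; pos < conv_w * conv_h; ++pos) // spatial position (GEMM row)
    {
        for(size_t m = 0; m < ofm; ++m)               // output feature map (GEMM column)
        {
            // Destination layout: pos + m * (plane size)
            out[pos + m * conv_w * conv_h] = gemm_out[pos * ofm + m];
        }
    }
    return out;
}
```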

References arm_compute::test::validation::act_info, ITensorAllocator::allocate(), GCTensor::allocator(), ARM_COMPUTE_ERROR_ON, ARM_COMPUTE_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_ERROR_ON_MSG, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_UNUSED, GCActivationLayer::configure(), GCFillBorderKernel::configure(), GCConvolutionLayerReshapeWeights::configure(), GCCol2ImKernel::configure(), GCIm2ColKernel::configure(), arm_compute::CONSTANT, arm_compute::test::validation::conv_info, ITensorInfo::data_type(), arm_compute::test::validation::dilation, ITensorInfo::dimension(), TensorInfo::dimension(), ITensorInfo::extend_padding(), arm_compute::F16, arm_compute::F32, GCScheduler::get(), ITensor::info(), CLTensor::info(), GCTensor::info(), ITensorAllocator::init(), MemoryGroupBase< TensorType >::manage(), ITensorInfo::num_dimensions(), TensorInfo::num_dimensions(), arm_compute::test::validation::num_groups, arm_compute::scaled_dimensions(), TensorShape::set(), IGCKernel::set_target(), ITensorInfo::tensor_shape(), TensorInfo::tensor_shape(), arm_compute::test::validation::weights, and arm_compute::test::validation::weights_info.

◆ operator=() [1/2]

GCConvolutionLayer & operator= ( const GCConvolutionLayer & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

GCConvolutionLayer & operator= ( GCConvolutionLayer && )
default

Default move assignment operator.

◆ prepare()

void prepare ( )
overridevirtual

Prepare the function for executing.

Any one off pre-processing step required by the function is handled here

Note
Prepare stage might not need all the function's buffers' backing memory to be available in order to execute

Reimplemented from IFunction.

Definition at line 226 of file GCConvolutionLayer.cpp.

227 {
228  if(!_is_prepared)
229  {
230  ARM_COMPUTE_ERROR_ON(!_original_weights->is_used());
231 
232  // Run weights reshaping and mark as unused
233  _weights_reshaped.allocator()->allocate();
234  _reshape_weights.run();
235 
236  // Mark original weights tensor as unused
237  _original_weights->mark_as_unused();
238 
239  _is_prepared = true;
240  }
241 }

References ITensorAllocator::allocate(), GCTensor::allocator(), ARM_COMPUTE_ERROR_ON, ITensor::is_used(), ITensor::mark_as_unused(), and GCConvolutionLayerReshapeWeights::run().

Referenced by GCConvolutionLayer::run().

◆ run()

void run ( )
overridevirtual

Run the kernels contained in the function.

For NEON kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Will call prepare() on first run if it hasn't been done already

Implements IFunction.

Definition at line 200 of file GCConvolutionLayer.cpp.

201 {
202  prepare();
203 
204  MemoryGroupResourceScope scope_mg(_memory_group);
205 
206  // Run im2col
207  GCScheduler::get().dispatch(_fill_border);
209  GCScheduler::get().dispatch(_input_im2col_kernel);
210 
211  // Run gemm on reshaped matrices
212  _mm_gemm.run();
213  GCScheduler::get().memory_barrier();
214 
215  // Reshape output matrix
216  GCScheduler::get().dispatch(_output_col2im_kernel, false);
217  GCScheduler::get().memory_barrier();
218 
219  // Run Activation Layer
220  if(_is_activationlayer_enabled)
221  {
222  _activationlayer_function.run();
223  }
224 }

References GCScheduler::dispatch(), GCScheduler::get(), GCScheduler::memory_barrier(), GCConvolutionLayer::prepare(), IGCSimpleFunction::run(), and GCGEMM::run().
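The prepare-on-first-run behaviour that run() delegates to prepare() can be mimicked with a minimal mock, shown below. `MockFunction` is an illustrative stand-in, not a library class; the counter stands in for the one-off weights reshape:

```cpp
#include <cassert>

// Minimal mock of the prepare-on-first-run pattern used by
// GCConvolutionLayer::run(): the one-off step executes exactly once,
// on the first call, and every later run() skips it.
class MockFunction
{
public:
    void run()
    {
        prepare(); // first call performs the one-off step
        ++runs;
    }
    void prepare()
    {
        if(!is_prepared)
        {
            ++prepare_count; // stands in for the weights reshape
            is_prepared = true;
        }
    }
    int runs          = 0;
    int prepare_count = 0;

private:
    bool is_prepared = false;
};
```

Calling run() repeatedly increments `runs` each time but leaves `prepare_count` at 1, matching the guarded `_is_prepared` logic in the prepare() listing above.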


The documentation for this class was generated from the following files:

GCConvolutionLayer.h
GCConvolutionLayer.cpp