Compute Library
 20.02.1
CLFullyConnectedLayer Class Reference

Basic function to compute a Fully Connected layer on OpenCL.

#include <CLFullyConnectedLayer.h>


Public Member Functions

 CLFullyConnectedLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr, IWeightsManager *weights_manager=nullptr)
 	Constructor.

 CLFullyConnectedLayer (const CLFullyConnectedLayer &)=delete
 	Prevent instances of this class from being copied (as this class contains pointers).

 CLFullyConnectedLayer (CLFullyConnectedLayer &&)=default
 	Default move constructor.

CLFullyConnectedLayer & 	operator= (const CLFullyConnectedLayer &)=delete
 	Prevent instances of this class from being copied (as this class contains pointers).

CLFullyConnectedLayer & 	operator= (CLFullyConnectedLayer &&)=default
 	Default move assignment operator.

void 	configure (const ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, FullyConnectedLayerInfo fc_info=FullyConnectedLayerInfo())
 	Set the input and output tensors.

void 	run () override
 	Run the kernels contained in the function.

void 	prepare () override
 	Prepare the function for executing.

- Public Member Functions inherited from IFunction

virtual 	~IFunction ()=default
 	Destructor.

Static Public Member Functions

static Status 	validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, FullyConnectedLayerInfo fc_info=FullyConnectedLayerInfo())
 	Static function to check if given info will lead to a valid configuration of CLFullyConnectedLayer.

Detailed Description

Basic function to compute a Fully Connected layer on OpenCL.

This function calls the following OpenCL kernels:

  1. CLIm2ColKernel (called when the input comes from a convolutional layer)
  2. CLFullyConnectedLayerReshapeWeights (if are_weights_reshaped is set to false and transpose_weights is set to true) (called once)
  3. CLGEMMMatrixMultiplyKernel or CLGEMMLowpMatrixMultiplyCore (if quantized asymmetric)
Note
The fully connected layer accepts "weights" tensors only with 2 dimensions.

Definition at line 121 of file CLFullyConnectedLayer.h.
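The kernel sequence above reduces, for a single sample, to flattening the input and multiplying it by the transposed 2D weights before adding the bias. A minimal plain-C++ reference of that computation (for illustration only; it is not the library API, which runs this on OpenCL):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference of what the kernel sequence computes for one sample: the input is
// flattened, multiplied by the 2D weights (row-major [num_outputs x num_inputs]),
// and the optional bias is added.
std::vector<float> fully_connected(const std::vector<float> &input,   // flattened input
                                   const std::vector<float> &weights, // [num_outputs x num_inputs]
                                   const std::vector<float> &biases,  // empty = no bias (nullptr in the API)
                                   std::size_t num_outputs)
{
    const std::size_t num_inputs = input.size();
    assert(weights.size() == num_outputs * num_inputs);
    std::vector<float> output(num_outputs, 0.f);
    for(std::size_t o = 0; o < num_outputs; ++o)
    {
        float acc = biases.empty() ? 0.f : biases[o];
        for(std::size_t i = 0; i < num_inputs; ++i)
        {
            acc += weights[o * num_inputs + i] * input[i];
        }
        output[o] = acc;
    }
    return output;
}
```

For the exact tensor shapes and supported data types, see configure() below.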

Constructor & Destructor Documentation

◆ CLFullyConnectedLayer() [1/3]

CLFullyConnectedLayer ( std::shared_ptr< IMemoryManager > memory_manager = nullptr,
IWeightsManager * weights_manager = nullptr 
)

Constructor.

Definition at line 138 of file CLFullyConnectedLayer.cpp.

    : _memory_group(memory_manager), _weights_manager(weights_manager), _convert_weights(), _convert_weights_managed(), _reshape_weights_managed_function(), _flatten_layer(), _reshape_weights_function(),
      _mm_gemm(memory_manager, weights_manager), _mm_gemmlowp(memory_manager), _flatten_output(), _converted_weights_output(), _reshape_weights_output(), _are_weights_converted(true),
      _are_weights_reshaped(true), _is_fc_after_conv(true), _is_quantized(false), _is_prepared(false), _original_weights(nullptr)
{
}

◆ CLFullyConnectedLayer() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ CLFullyConnectedLayer() [3/3]

Default move constructor.

Member Function Documentation

◆ configure()

void configure ( const ICLTensor * input,
const ICLTensor * weights,
const ICLTensor * biases,
ICLTensor * output,
FullyConnectedLayerInfo  fc_info = FullyConnectedLayerInfo() 
)

Set the input and output tensors.

Parameters
[in]	input	Source tensor. Data type supported: QASYMM8/QASYMM8_SIGNED/F16/F32.
[in]	weights	Weights tensor. The weights must be 2 dimensional. If this function is called after a Convolution Layer, the (transposed) weights will have as many rows as the product of the first 3 input's dimensions. If it is called after another FullyConnected Layer, the (transposed) weights will have as many rows as the input's first dimension. Data type supported: Same as input.
[in]	biases	Bias tensor. Can be nullptr. Data type supported: Same as input.
[out]	output	Destination tensor. Its shape should be equal to the output of a matrix multiplication between:
  • The output of im2col on the input and the (transposed) 2D weights, if the function is called after a Convolution Layer
  • The input tensor and the (transposed) 2D weights, if the function is called after another FullyConnected Layer. Data type supported: Same as input.
[in]	fc_info	(Optional) Fully connected layer additional info
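The weights-shape rule for the two call sites can be stated numerically. A hedged sketch (these helper names are illustrative, not part of the library):

```cpp
#include <cstddef>

// Illustrative helpers (not library API): expected number of rows in the
// (transposed) 2D weights tensor, for the two cases described above.

// Called after a Convolution Layer: rows = product of the first 3 input dims.
std::size_t weights_rows_after_conv(std::size_t w, std::size_t h, std::size_t c)
{
    return w * h * c;
}

// Called after another FullyConnected Layer: rows = the input's first dimension.
std::size_t weights_rows_after_fc(std::size_t input_dim0)
{
    return input_dim0;
}
```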

Definition at line 213 of file CLFullyConnectedLayer.cpp.

{
    ARM_COMPUTE_ERROR_ON_NULLPTR(input, weights, output);

    // Perform validate step
    ARM_COMPUTE_ERROR_THROW_ON(CLFullyConnectedLayer::validate(input->info(),
                                                               weights->info(),
                                                               biases != nullptr ? biases->info() : nullptr,
                                                               output->info(),
                                                               fc_info));

    _are_weights_converted = true;
    _are_weights_reshaped  = fc_info.transpose_weights ? fc_info.are_weights_reshaped : true;
    _is_fc_after_conv      = true;
    _is_quantized          = is_data_type_quantized_asymmetric(input->info()->data_type());
    _is_prepared           = fc_info.retain_internal_weights;
    _original_weights      = weights;

    if(_weights_manager)
    {
        _weights_manager->manage(weights);
    }

    const ICLTensor *weights_to_use = weights;

    // With the Fully Connected layer we can have 4 different cases:
    //  1) Convolution layer -> Fully Connected layer without batches
    //  2) Fully Connected layer -> Fully Connected layer without batches
    //  3) Convolution layer -> Fully Connected layer with batches
    //  4) Fully Connected layer -> Fully Connected layer with batches

    // Check if we have a fully connected layer with batches
    const bool is_batched_fc_layer = output->info()->dimension(1) > 1;
    if(is_batched_fc_layer)
    {
        _is_fc_after_conv = (TensorShape::num_max_dimensions >= 4) && (std::equal(input->info()->tensor_shape().cbegin() + 3,
                                                                                  input->info()->tensor_shape().cend(),
                                                                                  output->info()->tensor_shape().cbegin() + 1));
    }
    else
    {
        _is_fc_after_conv = input->info()->num_dimensions() > 1;
    }

    // Reshape weights if needed
    if(!_are_weights_reshaped)
    {
        if(_weights_manager && _weights_manager->are_weights_managed(weights))
        {
            _reshape_weights_managed_function.configure(weights);
            weights_to_use = utils::cast::polymorphic_downcast<ICLTensor *>(_weights_manager->acquire(weights, &_reshape_weights_managed_function));
        }
        else
        {
            // Reshape the weights
            _reshape_weights_function.configure(weights, &_reshape_weights_output);
            weights_to_use = &_reshape_weights_output;
        }
    }

    // Convert weights if needed
    if(_is_fc_after_conv && (input->info()->data_layout() != fc_info.weights_trained_layout))
    {
        if(_weights_manager && _weights_manager->are_weights_managed(weights_to_use))
        {
            _convert_weights_managed.configure(weights_to_use,
                                               input->info()->tensor_shape(),
                                               fc_info.weights_trained_layout);
            weights_to_use = utils::cast::polymorphic_downcast<ICLTensor *>(_weights_manager->acquire(weights, &_convert_weights_managed));
        }
        else
        {
            // Convert weights
            _convert_weights.configure(weights_to_use,
                                       &_converted_weights_output,
                                       input->info()->tensor_shape(),
                                       fc_info.weights_trained_layout);

            weights_to_use = &_converted_weights_output;
        }
        _are_weights_converted = false;
    }

    if(_is_fc_after_conv)
    {
        // Fully Connected layer after a Convolution Layer without batches
        configure_conv_fc(input, weights_to_use, biases, output, fc_info);
    }
    else
    {
        // Fully Connected layer after a Fully Connected Layer without batches
        configure_fc_fc(input, weights_to_use, biases, output, fc_info);
    }
}

References IWeightsManager::acquire(), IWeightsManager::are_weights_managed(), FullyConnectedLayerInfo::are_weights_reshaped, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, Dimensions< T >::cbegin(), CLConvertFullyConnectedWeights::configure(), CLFullyConnectedLayerReshapeWeights::configure(), CLConvertFullyConnectedWeightsManaged::configure(), CLFullyConnectedLayerReshapeWeightsManaged::configure(), ITensorInfo::dimension(), ITensor::info(), CLTensor::info(), arm_compute::test::validation::input, arm_compute::is_data_type_quantized_asymmetric(), IWeightsManager::manage(), Dimensions< uint32_t >::num_max_dimensions, FullyConnectedLayerInfo::retain_internal_weights, ITensorInfo::tensor_shape(), FullyConnectedLayerInfo::transpose_weights, CLFullyConnectedLayer::validate(), arm_compute::test::validation::weights, and FullyConnectedLayerInfo::weights_trained_layout.

Referenced by CLRNNLayer::configure(), CLLSTMLayer::configure(), and arm_compute::test::validation::DATA_TEST_CASE().
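The batched-case detection in the body above compares the input's dimensions from index 3 onward against the output's dimensions from index 1 onward. A standalone sketch of that check, on plain shape vectors rather than TensorShape:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Standalone mirror of the detection in configure(): with batches, an input
// coming from a convolution keeps batch (and higher) dimensions starting at
// index 3, while the fully connected output carries them starting at index 1.
bool is_fc_after_conv(const std::vector<std::size_t> &input_shape,
                      const std::vector<std::size_t> &output_shape)
{
    const bool is_batched = output_shape.size() > 1 && output_shape[1] > 1;
    if(is_batched)
    {
        return input_shape.size() >= 4 &&
               std::equal(input_shape.begin() + 3, input_shape.end(),
                          output_shape.begin() + 1);
    }
    // Without batches: a convolution output still has more than one dimension.
    return input_shape.size() > 1;
}
```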

◆ operator=() [1/2]

CLFullyConnectedLayer& operator= ( const CLFullyConnectedLayer &  )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

CLFullyConnectedLayer& operator= ( CLFullyConnectedLayer &&  )
default

Default move assignment operator.

◆ prepare()

void prepare ( )
override virtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note
Prepare stage might not need all the function's buffers' backing memory to be available in order to execute

Reimplemented from IFunction.

Definition at line 407 of file CLFullyConnectedLayer.cpp.

{
    if(!_is_prepared)
    {
        if(!_weights_manager)
        {
            ARM_COMPUTE_ERROR_ON(!_original_weights->is_used());
        }

        auto release_unused = [](CLTensor * w)
        {
            if(!w->is_used())
            {
                CLScheduler::get().queue().finish();
                w->allocator()->free();
            }
        };

        // Pointer to current weights
        const ICLTensor *cur_weights = _original_weights;

        // Reshape of the weights if needed (happens only once)
        if(!_are_weights_reshaped)
        {
            if(_weights_manager && _weights_manager->are_weights_managed(_original_weights))
            {
                cur_weights = utils::cast::polymorphic_downcast<ICLTensor *>(_weights_manager->run(cur_weights, &_reshape_weights_managed_function));
            }
            else
            {
                // Run reshape weights kernel and mark weights as unused
                _reshape_weights_output.allocator()->allocate();
                _reshape_weights_function.run();

                cur_weights->mark_as_unused();
                cur_weights = &_reshape_weights_output;
            }
            _are_weights_reshaped = true;
        }

        // Convert weights if needed (happens only once)
        if(!_are_weights_converted)
        {
            if(_weights_manager && _weights_manager->are_weights_managed(cur_weights))
            {
                _weights_manager->run(cur_weights, &_convert_weights_managed);
            }
            else
            {
                _converted_weights_output.allocator()->allocate();
                _convert_weights.run();
                cur_weights->mark_as_unused();
            }

            _are_weights_converted = true;
        }

        // Release reshaped weights if unused
        release_unused(&_reshape_weights_output);

        // Prepare GEMM and release unused weights
        if(!_is_quantized)
        {
            _mm_gemm.prepare();
        }

        // Release converted weights if unused
        release_unused(&_reshape_weights_output);
        release_unused(&_converted_weights_output);

        _is_prepared = true;
    }
}

References CLTensorAllocator::allocate(), CLTensor::allocator(), IWeightsManager::are_weights_managed(), ARM_COMPUTE_ERROR_ON, CLScheduler::get(), ITensor::is_used(), ITensor::mark_as_unused(), CLGEMM::prepare(), CLScheduler::queue(), ICLSimpleFunction::run(), IWeightsManager::run(), and arm_compute::test::validation::w.

Referenced by CLRNNLayer::prepare(), and CLFullyConnectedLayer::run().
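The one-shot behaviour above (reshape and convert the weights once, then release their backing memory) is a lazy-initialisation pattern: expensive work runs at most once, guarded by a flag, and run() pays the cost on first use. A minimal sketch with hypothetical names:

```cpp
// Minimal sketch of the lazy prepare pattern (all names hypothetical): the
// guarded one-off step stands in for the weight reshape/convert and release.
struct LazyFunction
{
    int  prepare_calls = 0;
    bool is_prepared   = false;

    void prepare()
    {
        if(!is_prepared)
        {
            ++prepare_calls;   // one-off work: reshape/convert weights, free originals
            is_prepared = true;
        }
    }

    void run()
    {
        prepare();             // first run triggers the one-off cost
        // ... enqueue kernels ...
    }
};
```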

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For NEON kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Will call prepare() on the first run if it hasn't been done already.

Implements IFunction.

Definition at line 384 of file CLFullyConnectedLayer.cpp.

{
    prepare();

    MemoryGroupResourceScope scope_mg(_memory_group);

    // Linearize input if it comes from a convolutional layer
    if(_is_fc_after_conv)
    {
        _flatten_layer.run();
    }

    // Run matrix multiply
    if(_is_quantized)
    {
        _mm_gemmlowp.run();
    }
    else
    {
        _mm_gemm.run();
    }
}

References CLFullyConnectedLayer::prepare(), ICLSimpleFunction::run(), CLGEMMLowpMatrixMultiplyCore::run(), and CLGEMM::run().

Referenced by CLRNNLayer::run(), and CLLSTMLayer::run().

◆ validate()

Status validate ( const ITensorInfo * input,
const ITensorInfo * weights,
const ITensorInfo * biases,
const ITensorInfo * output,
FullyConnectedLayerInfo  fc_info = FullyConnectedLayerInfo() 
)
static

Static function to check if given info will lead to a valid configuration of CLFullyConnectedLayer.

Parameters
[in]	input	Source tensor info. Data type supported: QASYMM8/QASYMM8_SIGNED/F16/F32.
[in]	weights	Weights tensor info. The weights must be 2 dimensional. If this function is called after a Convolution Layer, the (transposed) weights will have as many rows as the product of the first 3 input's dimensions. If it is called after another FullyConnected Layer, the (transposed) weights will have as many rows as the input's first dimension. Data type supported: Same as input.
[in]	biases	Bias tensor info. Can be nullptr. Data type supported: Same as input.
[out]	output	Destination tensor info. Its shape should be equal to the output of a matrix multiplication between:
  • The output of im2col on the input and the (transposed) 2D weights, if the function is called after a Convolution Layer
  • The input tensor and the (transposed) 2D weights, if the function is called after another FullyConnected Layer. Data type supported: Same as input.
[in]	fc_info	(Optional) Fully connected layer additional info

Returns
a status

Definition at line 309 of file CLFullyConnectedLayer.cpp.

{
    ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(input, weights, output);
    ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(input, 1, DataType::QASYMM8, DataType::QASYMM8_SIGNED, DataType::F16, DataType::F32);
    ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(input, weights, output);
    ARM_COMPUTE_RETURN_ERROR_ON(weights->num_dimensions() > 2);

    bool weights_reshaped = fc_info.transpose_weights ? fc_info.are_weights_reshaped : true;
    bool is_fc_after_conv = true;

    const ITensorInfo &flatten_input     = TensorInfo(input->clone()->set_is_resizable(true).reset_padding().set_tensor_shape(compute_flatten_shape(input)).set_data_layout(DataLayout::NCHW));
    const ITensorInfo &reshaped_weights  = TensorInfo(weights->clone()->set_is_resizable(true).reset_padding().set_tensor_shape(compute_transposed_shape(*weights)));
    const ITensorInfo &converted_weights = weights_reshaped ? TensorInfo(weights->clone()->set_is_resizable(true).reset_padding()) : TensorInfo(*reshaped_weights.clone());

    // With the Fully Connected layer we can have 4 different cases:
    //  1) Convolution layer -> Fully Connected layer without batches
    //  2) Fully Connected layer -> Fully Connected layer without batches
    //  3) Convolution layer -> Fully Connected layer with batches
    //  4) Fully Connected layer -> Fully Connected layer with batches

    const ITensorInfo *input_to_use   = input;
    const ITensorInfo *weights_to_use = weights;

    // Check if we have a fully connected layer with batches
    const bool is_batched_fc_layer = output->dimension(1) > 1;
    if(is_batched_fc_layer)
    {
        is_fc_after_conv = (TensorShape::num_max_dimensions >= 4) && (std::equal(input->tensor_shape().cbegin() + 3,
                                                                                 input->tensor_shape().cend(),
                                                                                 output->tensor_shape().cbegin() + 1));
    }
    else
    {
        is_fc_after_conv = input->num_dimensions() > 1;
    }

    if(!weights_reshaped)
    {
        // Validate reshape weights kernel
        ARM_COMPUTE_RETURN_ON_ERROR(CLFullyConnectedLayerReshapeWeights::validate(weights, &reshaped_weights));
        weights_to_use = &reshaped_weights;
    }

    if(is_fc_after_conv && (input->data_layout() != fc_info.weights_trained_layout))
    {
        // Validate convert weights kernel
        ARM_COMPUTE_RETURN_ON_ERROR(CLConvertFullyConnectedWeights::validate(weights_to_use,
                                                                             &converted_weights,
                                                                             input->tensor_shape(),
                                                                             fc_info.weights_trained_layout));
        weights_to_use = &converted_weights;
    }

    if(is_fc_after_conv)
    {
        // Fully Connected layer after a Convolution Layer without batches
        ARM_COMPUTE_RETURN_ERROR_ON((weights_to_use->dimension(1) != (input->dimension(0) * input->dimension(1) * input->dimension(2))));

        // Validate flatten kernel
        ARM_COMPUTE_RETURN_ON_ERROR(CLFlattenLayer::validate(input, &flatten_input));
        input_to_use = &flatten_input;
    }
    else
    {
        // Fully Connected layer after a Fully Connected Layer without batches
        ARM_COMPUTE_RETURN_ERROR_ON(input->dimension(0) != weights_to_use->dimension(1));
    }

    // Validate matrix multiply kernel
    ARM_COMPUTE_RETURN_ON_ERROR(validate_mm(*input_to_use, *weights_to_use, biases, *output, fc_info));

    return Status{};
}

References FullyConnectedLayerInfo::are_weights_reshaped, ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, Dimensions< T >::cbegin(), arm_compute::misc::shape_calculator::compute_flatten_shape(), arm_compute::misc::shape_calculator::compute_transposed_shape(), ITensorInfo::dimension(), arm_compute::F16, arm_compute::F32, arm_compute::test::validation::input, arm_compute::NCHW, Dimensions< uint32_t >::num_max_dimensions, arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, ITensorInfo::tensor_shape(), FullyConnectedLayerInfo::transpose_weights, CLConvertFullyConnectedWeights::validate(), CLFlattenLayer::validate(), CLFullyConnectedLayerReshapeWeights::validate(), arm_compute::test::validation::weights, and FullyConnectedLayerInfo::weights_trained_layout.

Referenced by CLFullyConnectedLayer::configure(), CLRNNLayer::validate(), and CLLSTMLayer::validate().
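The dimension checks performed by validate() can be mirrored on plain shape vectors. A hedged standalone sketch (function and parameter names are illustrative, not library API):

```cpp
#include <cstddef>
#include <vector>

// Standalone mirror of validate()'s shape checks: weights must be 2D, and its
// second dimension must match either the product of the input's first three
// dimensions (after a convolution) or the input's first dimension (after
// another fully connected layer).
bool fc_shapes_valid(const std::vector<std::size_t> &input_shape,
                     const std::vector<std::size_t> &weights_shape,
                     bool after_conv)
{
    if(weights_shape.size() != 2)
    {
        return false;
    }
    if(after_conv)
    {
        return input_shape.size() >= 3 &&
               weights_shape[1] == input_shape[0] * input_shape[1] * input_shape[2];
    }
    return !input_shape.empty() && weights_shape[1] == input_shape[0];
}
```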


The documentation for this class was generated from the following files:

  • CLFullyConnectedLayer.h
  • CLFullyConnectedLayer.cpp