Compute Library
 23.05
CLFullyConnectedLayer Class Reference

Basic function to compute a Fully Connected layer on OpenCL.

#include <CLFullyConnectedLayer.h>

Public Member Functions

 CLFullyConnectedLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr, IWeightsManager *weights_manager=nullptr)
 Constructor.
 
 ~CLFullyConnectedLayer ()
 Default destructor.
 
 CLFullyConnectedLayer (const CLFullyConnectedLayer &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 
 CLFullyConnectedLayer (CLFullyConnectedLayer &&)=default
 Default move constructor.
 
CLFullyConnectedLayer & operator= (const CLFullyConnectedLayer &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 
CLFullyConnectedLayer & operator= (CLFullyConnectedLayer &&)=default
 Default move assignment operator.
 
void configure (const CLCompileContext &compile_context, const ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, FullyConnectedLayerInfo fc_info=FullyConnectedLayerInfo())
 Set the input and output tensors.
 
void configure (const ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, FullyConnectedLayerInfo fc_info=FullyConnectedLayerInfo())
 Set the input and output tensors.
 
void run () override
 Run the kernels contained in the function.
 
void prepare () override
 Prepare the function for executing.
 
Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor.
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, FullyConnectedLayerInfo fc_info=FullyConnectedLayerInfo())
 Static function to check if given info will lead to a valid configuration of CLFullyConnectedLayer.
 

Detailed Description

Basic function to compute a Fully Connected layer on OpenCL.

This function calls the following OpenCL kernels:

  1. opencl::kernels::ClIm2ColKernel (called when the input comes from a convolutional layer)
  2. CLTranspose (called once, and only if are_weights_reshaped is set to false and transpose_weights is set to true)
  3. opencl::ClGemm or CLGEMMLowpMatrixMultiplyCore (if quantized asymmetric)
Note
The fully connected layer only accepts "weights" tensors with 2 dimensions.
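
The three stages listed above can be illustrated on the CPU with plain C++. The following is a minimal sketch under stated assumptions, not Compute Library code: `transpose` and `fully_connected_reference` are hypothetical helpers, tensors are modelled as row-major `std::vector<float>` buffers, and the im2col stage is reduced to treating each batch item as an already flattened row of `num_inputs` values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: transpose a row-major (rows x cols) matrix.
// Plays the role of the one-off weights transposition.
std::vector<float> transpose(const std::vector<float> &m, std::size_t rows, std::size_t cols)
{
    assert(m.size() == rows * cols);
    std::vector<float> t(m.size());
    for(std::size_t r = 0; r < rows; ++r)
    {
        for(std::size_t c = 0; c < cols; ++c)
        {
            t[c * rows + r] = m[r * cols + c];
        }
    }
    return t;
}

// Hypothetical reference: out[b][o] = bias[o] + sum_i input[b][i] * weights_t[i][o]
// input     : batches x num_inputs (each row is one flattened input)
// weights_t : num_inputs x num_outputs (already transposed)
// bias      : num_outputs values, or empty when there is no bias
std::vector<float> fully_connected_reference(const std::vector<float> &input,
                                             const std::vector<float> &weights_t,
                                             const std::vector<float> &bias,
                                             std::size_t batches, std::size_t num_inputs, std::size_t num_outputs)
{
    assert(input.size() == batches * num_inputs);
    assert(weights_t.size() == num_inputs * num_outputs);
    assert(bias.empty() || bias.size() == num_outputs);

    std::vector<float> out(batches * num_outputs, 0.0f);
    for(std::size_t b = 0; b < batches; ++b)
    {
        for(std::size_t o = 0; o < num_outputs; ++o)
        {
            float acc = bias.empty() ? 0.0f : bias[o];
            for(std::size_t i = 0; i < num_inputs; ++i)
            {
                acc += input[b * num_inputs + i] * weights_t[i * num_outputs + o];
            }
            out[b * num_outputs + o] = acc;
        }
    }
    return out;
}
```

Transposing the weights once up front and reusing the result on every call mirrors the "called once" remark on the transpose stage.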

Definition at line 43 of file CLFullyConnectedLayer.h.

Constructor & Destructor Documentation

◆ CLFullyConnectedLayer() [1/3]

CLFullyConnectedLayer ( std::shared_ptr< IMemoryManager > memory_manager = nullptr,
IWeightsManager * weights_manager = nullptr 
)

Constructor.

Definition at line 52 of file CLFullyConnectedLayer.cpp.

References CLFullyConnectedLayer::~CLFullyConnectedLayer().

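CLFullyConnectedLayer::CLFullyConnectedLayer(std::shared_ptr<IMemoryManager> memory_manager, IWeightsManager *weights_manager)
    : _impl(std::make_unique<Impl>())
{
    _impl->memory_group    = MemoryGroup(std::move(memory_manager));
    _impl->weights_manager = weights_manager;
}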

◆ ~CLFullyConnectedLayer()

~CLFullyConnectedLayer ( )
default

Default destructor.

Referenced by CLFullyConnectedLayer::CLFullyConnectedLayer().

◆ CLFullyConnectedLayer() [2/3]

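CLFullyConnectedLayer ( const CLFullyConnectedLayer & )
delete

Prevent instances of this class from being copied (as this class contains pointers).

◆ CLFullyConnectedLayer() [3/3]

CLFullyConnectedLayer ( CLFullyConnectedLayer && )
default

Default move constructor.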
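
The copy and move rules above follow the usual pattern for a class that owns its state through a pointer. The sketch below is an illustration only: `MoveOnlyLayer` and its `Impl` are hypothetical stand-ins, and the special members are defined out of line here (an assumption of this sketch) so that `Impl` is a complete type where they are generated.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a function object holding its state behind a pointer.
class MoveOnlyLayer
{
public:
    MoveOnlyLayer();
    ~MoveOnlyLayer();
    MoveOnlyLayer(const MoveOnlyLayer &) = delete;            // copying would alias the owned state
    MoveOnlyLayer(MoveOnlyLayer &&) noexcept;                 // moving transfers ownership
    MoveOnlyLayer &operator=(const MoveOnlyLayer &) = delete;
    MoveOnlyLayer &operator=(MoveOnlyLayer &&) noexcept;

    bool has_state() const;   // false for a moved-from object
    bool is_prepared() const; // requires has_state()

private:
    struct Impl;
    std::unique_ptr<Impl> _impl;
};

struct MoveOnlyLayer::Impl
{
    bool is_prepared{ false };
};

MoveOnlyLayer::MoveOnlyLayer() : _impl(std::make_unique<Impl>()) {}
MoveOnlyLayer::~MoveOnlyLayer() = default;
MoveOnlyLayer::MoveOnlyLayer(MoveOnlyLayer &&) noexcept = default;
MoveOnlyLayer &MoveOnlyLayer::operator=(MoveOnlyLayer &&) noexcept = default;

bool MoveOnlyLayer::has_state() const
{
    return _impl != nullptr;
}

bool MoveOnlyLayer::is_prepared() const
{
    assert(_impl != nullptr);
    return _impl->is_prepared;
}

static_assert(!std::is_copy_constructible<MoveOnlyLayer>::value, "copy construction is deleted");
static_assert(!std::is_copy_assignable<MoveOnlyLayer>::value, "copy assignment is deleted");
static_assert(std::is_move_constructible<MoveOnlyLayer>::value, "move construction is available");
```

After a move, the source object no longer owns any state, so it should only be destroyed or assigned to.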

Member Function Documentation

◆ configure() [1/2]

void configure ( const CLCompileContext & compile_context,
const ICLTensor * input,
const ICLTensor * weights,
const ICLTensor * biases,
ICLTensor * output,
FullyConnectedLayerInfo  fc_info = FullyConnectedLayerInfo() 
)

Set the input and output tensors.

Valid data layouts:

  • NHWC
  • NCHW

Valid data type configurations:

src0 (input)    src1 (weights)  src2 (biases)  dst (output)
F16             F16             F16            F16
F32             F32             F32            F32
QASYMM8         QASYMM8         S32            QASYMM8
QASYMM8_SIGNED  QASYMM8_SIGNED  S32            QASYMM8_SIGNED
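
The table above can be read as a simple rule: weights and output share the input's data type, while the bias is S32 for the two quantized types and matches the input otherwise. The sketch below encodes that rule; the `DataType` enum and both helper functions are hypothetical and local to this example, not the library's own definitions.

```cpp
#include <cassert>

// Hypothetical local enum mirroring the type names used in the table.
enum class DataType
{
    F16,
    F32,
    QASYMM8,
    QASYMM8_SIGNED,
    S32
};

// True for the two quantized asymmetric types listed in the table.
constexpr bool is_quantized_asymmetric(DataType t)
{
    return t == DataType::QASYMM8 || t == DataType::QASYMM8_SIGNED;
}

// Bias type implied by the table for a given input type.
constexpr DataType expected_bias_type(DataType src)
{
    return is_quantized_asymmetric(src) ? DataType::S32 : src;
}

// Checks one (src0, src1, src2, dst) row against the table.
constexpr bool is_valid_configuration(DataType src0, DataType src1, DataType src2, DataType dst)
{
    return src0 != DataType::S32 && src1 == src0 && dst == src0 && src2 == expected_bias_type(src0);
}
```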
Parameters
[in] compile_context The compile context to be used.
[in] input Source tensor. Data type supported: QASYMM8/QASYMM8_SIGNED/F16/F32.
[in] weights Weights tensor. The weights must be 2 dimensional. If this function is called after a Convolution Layer, the (transposed) weights will have as many rows as the product of the first 3 input's dimensions. If it is called after another FullyConnected Layer, the (transposed) weights will have as many rows as the input's first dimension. Data type supported: Same as input.
[in] biases Bias tensor. Can be nullptr. Data type supported: Same as input for F16/F32; S32 for quantized inputs, as listed in the data type table above.
[out] output Destination tensor. Data type supported: Same as input. Its shape should be equal to the output of a matrix multiplication between:
  • The output of im2col on the input and the (transposed) 2D weights, if the function is called after a Convolution Layer
  • The input tensor and the (transposed) 2D weights, if the function is called after another FullyConnected Layer
[in] fc_info (Optional) Fully connected layer additional info
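
The row-count rule for the (transposed) weights can be written down as a small helper. This is a sketch under stated assumptions: `expected_weight_rows` is a hypothetical function, and the input shape is modelled as a plain vector of dimension sizes with the first dimension at index 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of rows the (transposed) weights are expected to have.
// input_shape       : dimension sizes of the input, first dimension at index 0
// after_convolution : true when the input comes from a convolutional layer
std::size_t expected_weight_rows(const std::vector<std::size_t> &input_shape, bool after_convolution)
{
    assert(!input_shape.empty());
    if(!after_convolution)
    {
        // After another fully connected layer: the input's first dimension
        return input_shape[0];
    }
    // After a convolution layer: product of the first 3 input dimensions
    assert(input_shape.size() >= 3);
    return input_shape[0] * input_shape[1] * input_shape[2];
}
```

For example, a 7x7x512 feature map with a batch of 4 needs 7 * 7 * 512 = 25088 rows, while a 4096-element vector input needs 4096 rows.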

Definition at line 67 of file CLFullyConnectedLayer.cpp.

References arm_compute::ACL_DST, arm_compute::ACL_SRC_0, arm_compute::ACL_SRC_1, arm_compute::ACL_SRC_2, ITensorInfo::are_values_constant(), FullyConnectedLayerInfo::are_weights_reshaped, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, ITensor::info(), FullyConnectedLayerInfo::retain_internal_weights, FullyConnectedLayerInfo::transpose_weights, and CLFullyConnectedLayer::validate().

Referenced by CLRNNLayer::configure(), CLFullyConnectedLayer::configure(), and CLLSTMLayer::configure().

void CLFullyConnectedLayer::configure(const CLCompileContext &compile_context, const ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output,
                                      FullyConnectedLayerInfo fc_info)
{
    // Perform validate step
    ARM_COMPUTE_ERROR_ON_NULLPTR(input, weights, output);
    ARM_COMPUTE_ERROR_THROW_ON(CLFullyConnectedLayer::validate(input->info(),
                                                               weights->info(),
                                                               biases != nullptr ? biases->info() : nullptr,
                                                               output->info(),
                                                               fc_info));

    _impl->op               = std::make_unique<opencl::ClFullyConnected>();
    _impl->original_weights = weights;
    _impl->is_prepared      = fc_info.retain_internal_weights;

    _impl->op->configure(compile_context, input->info(), weights->info(), (biases != nullptr) ? biases->info() : nullptr, output->info(), fc_info);

    if(_impl->weights_manager != nullptr)
    {
        _impl->weights_manager->manage(_impl->original_weights);
    }

    if(!_impl->is_prepared)
    {
        _impl->aux_mem_req = _impl->op->workspace();
        _impl->run_pack    = { { ACL_SRC_0, input }, { ACL_SRC_1, weights }, { ACL_SRC_2, biases }, { ACL_DST, output } };
        _impl->workspace   = manage_workspace<CLTensor>(_impl->aux_mem_req, _impl->memory_group, _impl->run_pack, _impl->run_pack);
    }
    else
    {
        _impl->run_pack.add_tensor(ACL_SRC_0, input);
        _impl->run_pack.add_tensor(ACL_DST, output);
    }

    _impl->dynamic_weights =
        !weights->info()->are_values_constant() &&
        fc_info.transpose_weights &&
        !fc_info.are_weights_reshaped &&
        !fc_info.retain_internal_weights;
}

◆ configure() [2/2]

void configure ( const ICLTensor * input,
const ICLTensor * weights,
const ICLTensor * biases,
ICLTensor * output,
FullyConnectedLayerInfo  fc_info = FullyConnectedLayerInfo() 
)

Set the input and output tensors.

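Same as the configure() overload that takes a CLCompileContext, but uses the compile context obtained from the CLKernelLibrary singleton.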

Definition at line 61 of file CLFullyConnectedLayer.cpp.

References CLFullyConnectedLayer::configure(), and CLKernelLibrary::get().

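void CLFullyConnectedLayer::configure(const ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, FullyConnectedLayerInfo fc_info)
{
    configure(CLKernelLibrary::get().get_compile_context(), input, weights, biases, output, fc_info);
}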

◆ operator=() [1/2]

CLFullyConnectedLayer& operator= ( const CLFullyConnectedLayer & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

CLFullyConnectedLayer& operator= ( CLFullyConnectedLayer &&  )
default

Default move assignment operator.

◆ prepare()

void prepare ( )
override virtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note
The prepare stage might not need all the function's buffers' backing memory to be available in order to execute.

Reimplemented from IFunction.

Definition at line 125 of file CLFullyConnectedLayer.cpp.

References ITensor::is_used().

Referenced by CLRNNLayer::prepare(), and CLFullyConnectedLayer::run().

void CLFullyConnectedLayer::prepare()
{
    if(!_impl->is_prepared)
    {
        _impl->op->prepare(_impl->run_pack);

        // Release temporary tensors that are only used in prepare stage
        release_temporaries<CLTensor>(_impl->aux_mem_req, _impl->workspace);
        _impl->is_prepared = true;

        // Handle weights managed infrastructure
        if(_impl->weights_manager != nullptr && _impl->weights_manager->are_weights_managed(_impl->original_weights))
        {
            // Ensure that b gets marked as unused (memory released) only after the last function which uses b also finishes its prepare
            // This is for cases where multiple functions share the same b (weights)
            // Therefore when a function marks original b as unused, we pre-mark it in weights manager, and mark it back to used so that it doesn't get released before its last reference
            const ITensor *original_b = _impl->original_weights;
            if(!original_b->is_used())
            {
                _impl->weights_manager->pre_mark_as_unused(original_b);
            }
            _impl->original_weights->mark_as_used();
            _impl->weights_manager->release(_impl->original_weights);
        }
    }
}

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For CPU kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads.

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait, for example by calling CLScheduler::get().sync().
Calls prepare() on the first run if it has not been called already.
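
The lazy prepare behaviour can be sketched in isolation. The class below is a hypothetical stand-in, not the Compute Library class: it only models the `is_prepared` guard and the `dynamic_weights` check that appears in the run() listing below, and the counters exist purely so the behaviour can be observed.

```cpp
#include <cassert>

// Hypothetical stand-in modelling the prepare-on-first-run pattern.
class LazyFunction
{
public:
    explicit LazyFunction(bool dynamic_weights = false)
        : _dynamic_weights(dynamic_weights)
    {
    }

    // One-off work; repeated calls after the first are no-ops.
    void prepare()
    {
        if(!_is_prepared)
        {
            ++_prepare_count; // e.g. a one-off weights reshape would happen here
            _is_prepared = true;
        }
    }

    // Triggers prepare() unless the weights are dynamic, then does the per-run work.
    void run()
    {
        if(!_dynamic_weights)
        {
            prepare();
        }
        ++_run_count;
    }

    int prepare_count() const
    {
        return _prepare_count;
    }

    int run_count() const
    {
        return _run_count;
    }

private:
    bool _dynamic_weights;
    bool _is_prepared{ false };
    int  _prepare_count{ 0 };
    int  _run_count{ 0 };
};
```

Calling prepare() explicitly before the first run() moves the one-off cost out of the first inference without changing the result.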

Implements IFunction.

Definition at line 114 of file CLFullyConnectedLayer.cpp.

References CLFullyConnectedLayer::prepare().

Referenced by CLRNNLayer::run(), and CLLSTMLayer::run().

void CLFullyConnectedLayer::run()
{
    if(!_impl->dynamic_weights)
    {
        prepare();
    }

    MemoryGroupResourceScope scope_mg(_impl->memory_group);
    _impl->op->run(_impl->run_pack);
}

◆ validate()

Status validate ( const ITensorInfo * input,
const ITensorInfo * weights,
const ITensorInfo * biases,
const ITensorInfo * output,
FullyConnectedLayerInfo  fc_info = FullyConnectedLayerInfo() 
)
static

Static function to check if given info will lead to a valid configuration of CLFullyConnectedLayer.

Takes the same arguments as CLFullyConnectedLayer::configure(), expressed as ITensorInfo descriptors instead of tensors.

Returns
A status.
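
A common way to use a static validation function is to check the returned status before configuring anything. The sketch below shows that pattern with self-contained stand-ins: the `Status` and `TensorDesc` types and the `validate_weights` function are hypothetical and defined only for this example, and the single rule checked is the 2-dimensional weights requirement stated in the class description.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical minimal status type: an empty error string means success.
class Status
{
public:
    Status() = default;
    explicit Status(std::string error)
        : _error(std::move(error))
    {
    }

    // True when no error was recorded.
    explicit operator bool() const
    {
        return _error.empty();
    }

    const std::string &error_description() const
    {
        return _error;
    }

private:
    std::string _error;
};

// Hypothetical tensor descriptor carrying only what this check needs.
struct TensorDesc
{
    std::size_t num_dimensions;
};

// Hypothetical check: weights must be present and 2 dimensional.
Status validate_weights(const TensorDesc *weights)
{
    if(weights == nullptr)
    {
        return Status("weights must not be null");
    }
    if(weights->num_dimensions != 2)
    {
        return Status("weights must be 2 dimensional");
    }
    return Status();
}
```

Checking the status up front lets a caller report a readable error instead of hitting an assertion or exception during configuration.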

Definition at line 108 of file CLFullyConnectedLayer.cpp.

References ClFullyConnected::validate().

Referenced by CLFullyConnectedLayer::configure(), arm_compute::test::validation::DATA_TEST_CASE(), CLRNNLayer::validate(), and CLLSTMLayer::validate().

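Status CLFullyConnectedLayer::validate(const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, FullyConnectedLayerInfo fc_info)
{
    return opencl::ClFullyConnected::validate(input, weights, biases, output, fc_info);
}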

The documentation for this class was generated from the following files: