Compute Library
 21.02
GCFullyConnectedLayer Class Reference

Basic function to compute a Fully Connected layer on OpenGL ES. More...

#include <GCFullyConnectedLayer.h>

Collaboration diagram for GCFullyConnectedLayer:

Public Member Functions

 GCFullyConnectedLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr, IWeightsManager *weights_manager=nullptr)
 Constructor. More...
 
 GCFullyConnectedLayer (const GCFullyConnectedLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 GCFullyConnectedLayer (GCFullyConnectedLayer &&)=default
 Default move constructor. More...
 
GCFullyConnectedLayer & operator= (const GCFullyConnectedLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
GCFullyConnectedLayer & operator= (GCFullyConnectedLayer &&)=default
 Default move assignment operator. More...
 
void configure (const IGCTensor *input, const IGCTensor *weights, const IGCTensor *biases, IGCTensor *output, FullyConnectedLayerInfo fc_info=FullyConnectedLayerInfo())
 Set the input and output tensors. More...
 
void run () override
 Run the kernels contained in the function. More...
 
void prepare () override
 Prepare the function for executing. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 

Detailed Description

Basic function to compute a Fully Connected layer on OpenGL ES.

This function calls the following OpenGL ES kernels:

  1. GCIm2ColKernel (called when the input comes from a convolutional layer)
  2. GCFullyConnectedLayerReshapeWeights (if are_weights_reshaped is false and transpose_weights is true; called once)
  3. GCGEMMMatrixMultiplyKernel
  4. GCGEMMMatrixAccumulateBiasesKernel (if biases is not equal to nullptr)
Note
The fully connected layer accepts "weights" tensors only with 2 dimensions.
Deprecated:
This function is deprecated and is intended to be removed in the 21.05 release
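The kernel sequence above amounts to a matrix multiply followed by an optional bias accumulation. As a minimal scalar sketch of what the GPU kernels compute (plain C++, independent of the library; sizes and names are hypothetical):

```cpp
#include <array>
#include <cstddef>

// Hypothetical sizes for illustration only.
constexpr std::size_t num_inputs  = 3;
constexpr std::size_t num_outputs = 2;

// output = weights * input + bias: the computation performed on the GPU by
// GCGEMMMatrixMultiplyKernel and GCGEMMMatrixAccumulateBiasesKernel.
std::array<float, num_outputs> fully_connected(const std::array<float, num_inputs> &input,
                                               const std::array<float, num_outputs * num_inputs> &weights,
                                               const std::array<float, num_outputs> &bias)
{
    std::array<float, num_outputs> output{};
    for(std::size_t o = 0; o < num_outputs; ++o)
    {
        float acc = bias[o]; // bias accumulation step
        for(std::size_t i = 0; i < num_inputs; ++i)
        {
            acc += weights[o * num_inputs + i] * input[i]; // matrix multiply step
        }
        output[o] = acc;
    }
    return output;
}
```

When the input comes from a convolutional layer, GCIm2ColKernel first flattens it into the 1D vector this sketch consumes.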

Definition at line 70 of file GCFullyConnectedLayer.h.

Constructor & Destructor Documentation

◆ GCFullyConnectedLayer() [1/3]

GCFullyConnectedLayer ( std::shared_ptr< IMemoryManager > memory_manager = nullptr,
IWeightsManager * weights_manager = nullptr 
)

Constructor.

Definition at line 40 of file GCFullyConnectedLayer.cpp.

    : _memory_group(std::move(memory_manager)), _weights_manager(std::move(weights_manager)), _im2col_kernel(), _reshape_weights_kernel(), _mm_kernel(), _accumulate_biases_kernel(), _im2col_output(),
      _reshape_weights_output(), _original_weights(nullptr), _are_weights_reshaped(true), _is_fc_after_conv(true), _accumulate_biases(false)
{
}

◆ GCFullyConnectedLayer() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ GCFullyConnectedLayer() [3/3]

Default move constructor.

Member Function Documentation

◆ configure()

void configure ( const IGCTensor * input,
const IGCTensor * weights,
const IGCTensor * biases,
IGCTensor * output,
FullyConnectedLayerInfo  fc_info = FullyConnectedLayerInfo() 
)

Set the input and output tensors.

Parameters
[in]  input    Source tensor. Data type supported: F16/F32.
[in]  weights  Weights tensor. The weights must be 2 dimensional. Data type supported: Same as input.
[in]  biases   Bias tensor. It can be nullptr. Data type supported: Same as input.
[out] output   Destination tensor. Data type supported: Same as input.
[in]  fc_info  (Optional) Fully connected layer additional info.
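The transpose_weights and are_weights_reshaped fields of fc_info interact: reshaping is skipped either when the caller already supplied reshaped weights or when no transpose is requested at all. A plain-C++ restatement of that decision (the function name here is hypothetical):

```cpp
// Mirrors the _are_weights_reshaped assignment in configure():
// weights need no reshape pass unless a transpose was requested
// and the caller has not already performed it.
bool weights_already_reshaped(bool transpose_weights, bool are_weights_reshaped)
{
    return transpose_weights ? are_weights_reshaped : true;
}
```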

Definition at line 81 of file GCFullyConnectedLayer.cpp.

References FullyConnectedLayerInfo::are_weights_reshaped, ARM_COMPUTE_ERROR_ON, ARM_COMPUTE_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_ERROR_ON_MISMATCHING_DATA_TYPES, Dimensions< T >::cbegin(), Dimensions< T >::cend(), GCGEMMMatrixAccumulateBiasesKernel::configure(), GCFullyConnectedLayerReshapeWeights::configure(), ITensorInfo::dimension(), arm_compute::F16, arm_compute::F32, GCTensor::gc_buffer(), ITensor::info(), ITensorInfo::num_dimensions(), Dimensions< size_t >::num_max_dimensions, FullyConnectedLayerInfo::retain_internal_weights, ITensorInfo::tensor_shape(), and FullyConnectedLayerInfo::transpose_weights.

{
    ARM_COMPUTE_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(input, 1, DataType::F16, DataType::F32);
    ARM_COMPUTE_ERROR_ON_MISMATCHING_DATA_TYPES(input, weights, output);
    ARM_COMPUTE_ERROR_ON(weights->info()->num_dimensions() > 2);

    _original_weights     = weights;
    _are_weights_reshaped = fc_info.transpose_weights ? fc_info.are_weights_reshaped : true;
    _is_fc_after_conv     = true;
    _accumulate_biases    = false;

    if(biases != nullptr)
    {
        ARM_COMPUTE_ERROR_ON_MISMATCHING_DATA_TYPES(input, biases);

        _accumulate_biases = true;

        // Configure accumulate biases kernel
        _accumulate_biases_kernel.configure(output, biases);
    }

    // With the Fully Connected layer we can have 4 different cases:
    // 1) Convolution layer -> Fully Connected layer without batches
    // 2) Fully Connected layer -> Fully Connected layer without batches
    // 3) Convolution layer -> Fully Connected layer with batches
    // 4) Fully Connected layer -> Fully Connected layer with batches

    const IGCTensor *weights_to_use = weights;

    if(!_are_weights_reshaped)
    {
        weights_to_use = &_reshape_weights_output;

        // Reshape the weights
        _reshape_weights_kernel.configure(weights, &_reshape_weights_output);
    }

    // Check if we have a fully connected layer with batches
    const bool is_batched_fc_layer = output->info()->dimension(1) > 1;

    if(is_batched_fc_layer)
    {
        _is_fc_after_conv = (TensorShape::num_max_dimensions >= 4) && (std::equal(input->info()->tensor_shape().cbegin() + 3,
                                                                                  input->info()->tensor_shape().cend(),
                                                                                  output->info()->tensor_shape().cbegin() + 1));
    }
    else
    {
        _is_fc_after_conv = input->info()->num_dimensions() > 1;
    }

    if(_is_fc_after_conv)
    {
        // Fully Connected layer after a Convolution Layer without batches
        configure_conv_fc(input, weights_to_use, output);
    }
    else
    {
        // Fully Connected layer after a Fully Connected Layer without batches
        configure_fc_fc(input, weights_to_use, output);
    }

    ARM_COMPUTE_ERROR_ON(fc_info.retain_internal_weights && _reshape_weights_output.gc_buffer() == 0);
    _are_weights_reshaped = _are_weights_reshaped || fc_info.retain_internal_weights;
}

◆ operator=() [1/2]

GCFullyConnectedLayer& operator= ( const GCFullyConnectedLayer &  )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

GCFullyConnectedLayer& operator= ( GCFullyConnectedLayer &&  )
default

Default move assignment operator.

◆ prepare()

void prepare ( )
overridevirtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here

Note
Prepare stage might not need all the function's buffers' backing memory to be available in order to execute

Reimplemented from IFunction.

Definition at line 177 of file GCFullyConnectedLayer.cpp.

References ITensorAllocator::allocate(), GCTensor::allocator(), ARM_COMPUTE_ERROR_ON, ITensor::is_used(), ITensor::mark_as_unused(), and IGCSimpleFunction::run().

Referenced by GCFullyConnectedLayer::run().

{
    // Reshape of the weights (happens only once)
    if(!_are_weights_reshaped)
    {
        ARM_COMPUTE_ERROR_ON(!_original_weights->is_used());

        // Run reshape weights kernel and mark weights as unused
        _reshape_weights_output.allocator()->allocate();
        _reshape_weights_kernel.run();

        // Mark original weights tensor as unused
        _original_weights->mark_as_unused();

        _are_weights_reshaped = true;
    }
}
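The guard pattern used above, where an expensive one-off step (the weights reshape) runs on the first call only, can be sketched independently of the library (class and member names here are hypothetical):

```cpp
#include <functional>
#include <utility>

// One-shot preparation guard: the step supplied at construction runs
// on the first prepare() call only, mirroring _are_weights_reshaped.
class PrepareOnce
{
public:
    explicit PrepareOnce(std::function<void()> step) : _step(std::move(step)) {}

    void prepare()
    {
        if(!_done)
        {
            _step();
            _done = true;
        }
    }

private:
    std::function<void()> _step;
    bool                  _done{ false };
};
```

This is why run() can call prepare() unconditionally: repeated calls are cheap no-ops once the reshape has happened.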

◆ run()

void run ( )
overridevirtual

Run the kernels contained in the function.

For Neon kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Will call prepare() on first run if it hasn't been done already

Implements IFunction.

Definition at line 148 of file GCFullyConnectedLayer.cpp.

References GCScheduler::dispatch(), GCScheduler::get(), GCScheduler::memory_barrier(), and GCFullyConnectedLayer::prepare().

{
    prepare();

    MemoryGroupResourceScope scope_mg(_memory_group);

    // Linearize input if it comes from a convolutional layer
    if(_is_fc_after_conv)
    {
        GCScheduler::get().dispatch(_im2col_kernel, false);
    }

    if(!_are_weights_reshaped || _is_fc_after_conv)
    {
        GCScheduler::get().memory_barrier();
    }

    // Run matrix multiply
    GCScheduler::get().dispatch(_mm_kernel, !_accumulate_biases);

    // Accumulate biases if provided
    if(_accumulate_biases)
    {
        GCScheduler::get().memory_barrier();

        GCScheduler::get().dispatch(_accumulate_biases_kernel);
    }
}
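Which kernels run() actually dispatches depends on the state set up in configure(). A plain-C++ summary of that control flow (the helper function is hypothetical; the kernel names follow the class description):

```cpp
#include <string>
#include <vector>

// Returns, in dispatch order, the kernels run() would enqueue given the
// flags configure() derived from the tensors and FullyConnectedLayerInfo.
std::vector<std::string> kernels_dispatched(bool is_fc_after_conv, bool accumulate_biases)
{
    std::vector<std::string> kernels;
    if(is_fc_after_conv)
    {
        kernels.push_back("GCIm2ColKernel"); // linearize convolutional input
    }
    kernels.push_back("GCGEMMMatrixMultiplyKernel"); // always runs
    if(accumulate_biases)
    {
        kernels.push_back("GCGEMMMatrixAccumulateBiasesKernel"); // biases != nullptr
    }
    return kernels;
}
```

Memory barriers are inserted between stages (as seen in the definition above) because the GLES compute dispatches are asynchronous and later kernels read buffers written by earlier ones.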

The documentation for this class was generated from the following files:

  • GCFullyConnectedLayer.h
  • GCFullyConnectedLayer.cpp