Compute Library
 23.08
CpuFullyConnected Class Reference

Basic function to compute a Fully Connected layer. More...

#include <CpuFullyConnected.h>

Collaboration diagram for CpuFullyConnected:

Public Member Functions

 CpuFullyConnected ()
 Constructor. More...
 
 ~CpuFullyConnected ()
 Destructor. More...
 
void configure (const ITensorInfo *src, const ITensorInfo *weights, const ITensorInfo *biases, ITensorInfo *dst, FullyConnectedLayerInfo fc_info=FullyConnectedLayerInfo(), const WeightsInfo &weights_info=WeightsInfo())
 Set the input and output tensors. More...
 
void run (ITensorPack &tensors) override
 Run the kernels contained in the function. More...
 
void prepare (ITensorPack &tensors) override
 Prepare the function for executing. More...
 
experimental::MemoryRequirements workspace () const override
 Return the memory requirements required by the workspace. More...
 
- Public Member Functions inherited from INEOperator
 INEOperator (IRuntimeContext *ctx=nullptr)
 Constructor. More...
 
 INEOperator (const INEOperator &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 INEOperator (INEOperator &&)=default
 Default move constructor. More...
 
INEOperator & operator= (const INEOperator &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
INEOperator & operator= (INEOperator &&)=default
 Default move assignment operator. More...
 
 ~INEOperator ()
 Default destructor. More...
 
- Public Member Functions inherited from IOperator
virtual ~IOperator ()=default
 Destructor. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *src, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *dst, FullyConnectedLayerInfo fc_info=FullyConnectedLayerInfo(), const WeightsInfo &weights_info=WeightsInfo())
 Static function to check if given info will lead to a valid configuration of CpuFullyConnected. More...
 
static Status has_opt_impl (arm_compute::WeightFormat &expected_weight_format, const ITensorInfo *src, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *dst, FullyConnectedLayerInfo fc_info, WeightsInfo weights_info)
 Static function that queries whether a fixed-format kernel exists and, if so, returns in the first argument the format in which the weights are expected to be reshaped, as defined by the WeightFormat class. More...
 

Detailed Description

Basic function to compute a Fully Connected layer.

This function calls the following kernels:

  1. kernels::CpuIm2ColKernel (called when the input comes from a convolutional layer)
  2. kernels::CpuTransposeKernel (if are_weights_reshaped is set to false and transpose_weights is set to true) (called once)
  3. CpuGemm or CpuGemmLowpMatrixMultiplyCore (if quantized asymmetric)
  4. kernels::CpuGemmMatrixAdditionKernel or CpuGemmLowpOutputStage (if quantized asymmetric) (if biases is not equal to nullptr)
Note
The fully connected layer accepts "weights" tensors only with 2 dimensions.

Definition at line 55 of file CpuFullyConnected.h.

Constructor & Destructor Documentation

◆ CpuFullyConnected()

Constructor.

Definition at line 119 of file CpuFullyConnected.cpp.

120  : _flatten(nullptr),
121  _convert_weights(nullptr),
122  _transpose_weights(nullptr),
123  _mm_gemm(nullptr),
124  _mm_gemmlowp(nullptr),
125  _flattened_src(),
126  _converted_weights(),
127  _reshaped_weights(),
128  _trans_weights(),
129  _trans_weights_idx(AuxTensorIdx::Count),
130  _aux_mem(Count),
131  _needs_weights_conversion(false),
132  _needs_weights_reshape(false),
133  _is_fc_after_conv(false),
134  _is_quantized_asymmetric(false),
135  _is_prepared(false),
136  _enable_fast_math(false),
137  _fixed_format(false),
138  _weight_format(arm_compute::WeightFormat::UNSPECIFIED),
139  _dynamic_weights(false)
140 {
141 }

◆ ~CpuFullyConnected()

~CpuFullyConnected ( ) = default

Destructor.

Member Function Documentation

◆ configure()

void configure ( const ITensorInfo *  src,
const ITensorInfo *  weights,
const ITensorInfo *  biases,
ITensorInfo *  dst,
FullyConnectedLayerInfo  fc_info = FullyConnectedLayerInfo(),
const WeightsInfo &  weights_info = WeightsInfo() 
)

Set the input and output tensors.

Valid data layouts:

  • NHWC
  • NCHW

Valid data type configurations:

src0 src1 src2 dst
F16 F16 F16 F16
F32 F32 F32 F32
QASYMM8 QASYMM8 S32 QASYMM8
QASYMM8_SIGNED QASYMM8_SIGNED S32 QASYMM8_SIGNED
Parameters
[in] src  Source tensor info. Data type supported: QASYMM8/QASYMM8_SIGNED/F16/F32.
[in] weights  Weights tensor info. The weights must be 2 dimensional. If this function is called after a Convolution Layer, the (transposed) weights will have as many rows as the product of the first 3 input's dimensions. If it is called after another FullyConnected Layer, the (transposed) weights will have as many rows as the input's first dimension. Data type supported: Same as src.
[in] biases  Bias tensor info. Can be nullptr. Data type supported: Same as weights; S32 if weights are QASYMM8/QASYMM8_SIGNED.
[out] dst  Destination tensor info. Its shape should be equal to the output of a matrix multiplication between:
  • The output of im2col on the input and the (transposed) 2D weights, if the function is called after a Convolution Layer
  • The input tensor and the (transposed) 2D weights, if the function is called after another FullyConnected Layer. Data type supported: Same as src.
[in] fc_info  (Optional) Fully connected layer additional info
[in] weights_info  (Optional) Stores necessary compute information when weights are already reshaped

Definition at line 206 of file CpuFullyConnected.cpp.

208 {
209  // Perform validate step
210  ARM_COMPUTE_ERROR_ON_NULLPTR(src, weights, dst);
211  ARM_COMPUTE_ERROR_THROW_ON(CpuFullyConnected::validate(src,
212  weights,
213  biases != nullptr ? biases : nullptr,
214  dst,
215  fc_info,
216  weights_info));
217  ARM_COMPUTE_LOG_PARAMS(src, weights, biases, dst, fc_info);
218 
219  _needs_weights_conversion = false;
220  _needs_weights_reshape = fc_info.transpose_weights ? !fc_info.are_weights_reshaped : false;
221  _needs_weights_reshape = _needs_weights_reshape && !fc_info.retain_internal_weights;
222  _is_fc_after_conv = true;
223  _is_quantized_asymmetric = is_data_type_quantized_asymmetric(src->data_type());
224  _is_prepared = false;
225  _trans_weights_idx = AuxTensorIdx::Count;
226  _enable_fast_math = fc_info.enable_fast_math;
227  _fixed_format = weights_info.weight_format() != WeightFormat::UNSPECIFIED;
228  _weight_format = weights_info.weight_format();
229  _dynamic_weights = !weights->are_values_constant() && _needs_weights_reshape;
230 
231  // With the Fully Connected layer we can have 4 different cases:
232  // 1) Convolution layer -> Fully Connected layer without batches
233  // 2) Fully Connected layer -> Fully Connected layer without batches
234  // 3) Convolution layer -> Fully Connected layer with batches
235  // 4) Fully Connected layer -> Fully Connected layer with batches
236 
237  const ITensorInfo *weights_to_use = weights;
238 
239  // Check if we have a fully connected layer with batches
240  const bool is_batched_fc_layer = dst->dimension(1) > 1;
241  if(is_batched_fc_layer)
242  {
243  _is_fc_after_conv = (TensorShape::num_max_dimensions >= 4) && (std::equal(src->tensor_shape().cbegin() + 3, src->tensor_shape().cend(), dst->tensor_shape().cbegin() + 1));
244  }
245  else
246  {
247  _is_fc_after_conv = src->num_dimensions() > 1;
248  }
249 
250  // Reshape weights if needed
251  if(_needs_weights_reshape)
252  {
253  // Reshape the weights
254  _transpose_weights = std::make_unique<kernels::CpuTransposeKernel>();
255  _transpose_weights->configure(weights, &_reshaped_weights);
256  _reshaped_weights.set_are_values_constant(weights->are_values_constant());
257 
258  weights_to_use = &_reshaped_weights;
259  _trans_weights_idx = AuxTensorIdx::TransposedWeights;
260  }
261 
262  // Convert weights if needed
263  if(_is_fc_after_conv && (src->data_layout() != fc_info.weights_trained_layout))
264  {
265  // Convert weights
266  _convert_weights = std::make_unique<CpuConvertFullyConnectedWeights>();
267  _convert_weights->configure(weights_to_use,
268  &_converted_weights,
269  src->tensor_shape(),
270  fc_info.weights_trained_layout);
271  _converted_weights.set_are_values_constant(weights_to_use->are_values_constant());
272 
273  weights_to_use = &_converted_weights;
274  _needs_weights_conversion = true;
275  _trans_weights_idx = AuxTensorIdx::ConvertedWeights;
276  }
277 
278  if(_is_fc_after_conv)
279  {
280  // Fully Connected layer after a Convolution Layer without batches
281  configure_conv_fc(src, weights_to_use, biases, dst, fc_info.activation_info);
282  }
283  else
284  {
285  // Fully Connected layer after a Fully Connected Layer without batches
286  configure_fc_fc(src, weights_to_use, biases, dst, fc_info.activation_info);
287  }
288 
289  // Retain the tensorinfo with the weights to use
290  if(_needs_weights_reshape || _needs_weights_conversion)
291  {
292  _trans_weights = *weights_to_use;
293  }
294 
295  // Set auxiliary memory requirements
296  auto gemm_mem_req = (_is_quantized_asymmetric) ? _mm_gemmlowp->workspace() : _mm_gemm->workspace();
297  for(unsigned int i = 0; i < gemm_mem_req.size(); ++i)
298  {
299  _aux_mem[i] = gemm_mem_req[i];
300  }
301 
302  if(_aux_mem[Pretranspose].size > 0)
303  {
304  // Release permuted weights at the end of prepare as they are further transposed by the assembly dispatch
305  // Do not release them if biases are dynamic and data type is quantized, since the weights tensor will be used for biases offset calculation
306  // Keep all the auxiliary tensors in case of dynamic weights as they are recalculated every time.
307  _aux_mem[TransposedWeights] = MemoryInfo(
308  offset_int_vec(TransposedWeights),
309  _dynamic_weights ? MemoryLifetime::Temporary :
310  (_is_quantized_asymmetric && biases && !(biases->are_values_constant())) ? MemoryLifetime::Persistent :
311  MemoryLifetime::Prepare,
312  _reshaped_weights.total_size());
313 
314  _aux_mem[ConvertedWeights] = MemoryInfo(
315  offset_int_vec(ConvertedWeights),
316  _dynamic_weights ? MemoryLifetime::Temporary : MemoryLifetime::Prepare,
317  _converted_weights.total_size());
318  }
319  else
320  {
321  _aux_mem[TransposedWeights] = MemoryInfo(
322  offset_int_vec(TransposedWeights),
323  _dynamic_weights ? MemoryLifetime::Temporary :
324  _needs_weights_conversion ? MemoryLifetime::Prepare :
325  MemoryLifetime::Persistent,
326  _reshaped_weights.total_size());
327 
328  _aux_mem[ConvertedWeights] = MemoryInfo(
329  offset_int_vec(ConvertedWeights),
330  _dynamic_weights ? MemoryLifetime::Temporary : MemoryLifetime::Persistent,
331  _converted_weights.total_size());
332  }
333  _aux_mem[FlattenedSrc] = MemoryInfo(offset_int_vec(FlattenedSrc), MemoryLifetime::Temporary, _flattened_src.total_size());
334 }

References FullyConnectedLayerInfo::activation_info, ITensorInfo::are_values_constant(), FullyConnectedLayerInfo::are_weights_reshaped, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, ARM_COMPUTE_LOG_PARAMS, arm_compute::test::validation::dst, FullyConnectedLayerInfo::enable_fast_math, arm_compute::is_data_type_quantized_asymmetric(), Dimensions< size_t >::num_max_dimensions, arm_compute::offset_int_vec(), arm_compute::experimental::Prepare, FullyConnectedLayerInfo::retain_internal_weights, TensorInfo::set_are_values_constant(), arm_compute::test::validation::src, TensorInfo::total_size(), FullyConnectedLayerInfo::transpose_weights, arm_compute::UNSPECIFIED, CpuFullyConnected::validate(), arm_compute::test::validation::weights_info, and FullyConnectedLayerInfo::weights_trained_layout.

◆ has_opt_impl()

Status has_opt_impl ( arm_compute::WeightFormat &  expected_weight_format,
const ITensorInfo *  src,
const ITensorInfo *  weights,
const ITensorInfo *  biases,
const ITensorInfo *  dst,
FullyConnectedLayerInfo  fc_info,
WeightsInfo  weights_info 
)
static

Static function that queries whether a fixed-format kernel exists and, if so, returns in the first argument the format in which the weights are expected to be reshaped, as defined by the WeightFormat class.

Apart from the first argument, the remaining arguments are the same as in CpuFullyConnected::validate(), except that here all arguments are required.

Returns
a status

Definition at line 336 of file CpuFullyConnected.cpp.

338 {
339  GEMMInfo gemm_info;
340  gemm_info.set_activation_info(fc_info.activation_info);
341  gemm_info.set_fast_math(fc_info.enable_fast_math);
342  gemm_info.set_fixed_format(weights_info.weight_format() != WeightFormat::UNSPECIFIED);
343  gemm_info.set_weight_format(weights_info.weight_format());
344 
345  return CpuGemm::has_opt_impl(expected_weight_format, src, weights, biases, dst, gemm_info);
346 }

References FullyConnectedLayerInfo::activation_info, arm_compute::test::validation::dst, FullyConnectedLayerInfo::enable_fast_math, arm_compute::test::validation::gemm_info, CpuGemm::has_opt_impl(), arm_compute::test::validation::src, arm_compute::UNSPECIFIED, and arm_compute::test::validation::weights_info.

Referenced by NEFullyConnectedLayer::has_opt_impl().

◆ prepare()

void prepare ( ITensorPack &  tensors )
override virtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Parameters
[in] tensors  Vector that contains the constant tensors.
Note
Prepare stage might not need all the function's buffers' backing memory to be available in order to execute

Reimplemented from INEOperator.

Definition at line 487 of file CpuFullyConnected.cpp.

488 {
489  if(!_is_prepared || _dynamic_weights)
490  {
491 #ifdef ARM_COMPUTE_ASSERTS_ENABLED
492  ++_asrt_prepare_count;
493  ARM_COMPUTE_ERROR_ON(!_dynamic_weights && _asrt_prepare_count > 1);
494 #endif // ARM_COMPUTE_ASSERTS_ENABLED
495 
496  auto weights = tensors.get_const_tensor(ACL_SRC_1);
497 
498  CpuAuxTensorHandler reshaped_weights(offset_int_vec(TransposedWeights), _reshaped_weights, tensors, false);
499  CpuAuxTensorHandler converted_weights(offset_int_vec(ConvertedWeights), _converted_weights, tensors, false);
500 
501  // Pointer to current weights
502  const ITensor *cur_weights = weights;
503 
504  // Reshape of the weights (happens only once)
505  if(_needs_weights_reshape)
506  {
507  // Run reshape weights kernel and mark weights as unused
508  ITensorPack transpose_pack{ { ACL_SRC, weights }, { ACL_DST, reshaped_weights.get() } };
509  NEScheduler::get().schedule_op(_transpose_weights.get(), Window::DimY, _transpose_weights->window(), transpose_pack);
510 
511  cur_weights->mark_as_unused();
512  cur_weights = reshaped_weights.get();
513  }
514 
515  // Convert weights if needed (happens only once)
516  if(_needs_weights_conversion)
517  {
518  ITensorPack convert_pack{ { ACL_SRC, cur_weights }, { ACL_DST, converted_weights.get() } };
519  _convert_weights->run(convert_pack);
520 
521  cur_weights->mark_as_unused();
522  cur_weights = converted_weights.get();
523  }
524 
525  ITensorPack gemm_pack = tensors;
526  gemm_pack.add_const_tensor(ACL_SRC_1, cur_weights);
527 
528  // Prepare GEMM and release unused weights
529  if(!_is_quantized_asymmetric)
530  {
531  _mm_gemm->prepare(gemm_pack);
532  }
533  else
534  {
535  _mm_gemmlowp->prepare(gemm_pack);
536  }
537 
538  _is_prepared = true;
539  }
540 }

References arm_compute::ACL_DST, arm_compute::ACL_SRC, arm_compute::ACL_SRC_1, ITensorPack::add_const_tensor(), ARM_COMPUTE_ERROR_ON, Window::DimY, Scheduler::get(), CpuAuxTensorHandler::get(), ITensorPack::get_const_tensor(), ITensor::mark_as_unused(), arm_compute::offset_int_vec(), and IScheduler::schedule_op().

Referenced by CpuFullyConnected::run().

◆ run()

void run ( ITensorPack &  tensors )
override virtual

Run the kernels contained in the function.

Parameters
[in] tensors  Vector that contains the tensors to operate on.

Reimplemented from INEOperator.

Definition at line 448 of file CpuFullyConnected.cpp.

449 {
450  prepare(tensors);
451 
452 #ifdef ARM_COMPUTE_ASSERTS_ENABLED
453  ++_asrt_run_count;
454  ARM_COMPUTE_ERROR_ON(_dynamic_weights && _asrt_prepare_count != _asrt_run_count);
455 #endif // ARM_COMPUTE_ASSERTS_ENABLED
456 
457  auto src = tensors.get_const_tensor(ACL_SRC_0);
458 
459  CpuAuxTensorHandler flattened_src(offset_int_vec(FlattenedSrc), _flattened_src, tensors, false);
460  CpuAuxTensorHandler transformed_wei(offset_int_vec(_trans_weights_idx), _trans_weights, tensors, false);
461 
462  // Linearize src if it comes from a convolutional layer
463  if(_is_fc_after_conv)
464  {
465  ITensorPack flatten_pack{ { ACL_SRC, src }, { ACL_DST, flattened_src.get() } };
466  _flatten->run(flatten_pack);
467  }
468 
469  ITensorPack gemm_pack = tensors;
470  gemm_pack.add_const_tensor(ACL_SRC_0, (_is_fc_after_conv) ? flattened_src.get() : src);
471  if(_needs_weights_reshape || _needs_weights_conversion)
472  {
473  gemm_pack.add_const_tensor(ACL_SRC_1, transformed_wei.get());
474  }
475 
476  // Run matrix multiply
477  if(_is_quantized_asymmetric)
478  {
479  _mm_gemmlowp->run(gemm_pack);
480  }
481  else
482  {
483  _mm_gemm->run(gemm_pack);
484  }
485 }

References arm_compute::ACL_DST, arm_compute::ACL_SRC, arm_compute::ACL_SRC_0, arm_compute::ACL_SRC_1, ITensorPack::add_const_tensor(), ARM_COMPUTE_ERROR_ON, CpuAuxTensorHandler::get(), ITensorPack::get_const_tensor(), arm_compute::offset_int_vec(), CpuFullyConnected::prepare(), and arm_compute::test::validation::src.

◆ validate()

Status validate ( const ITensorInfo *  src,
const ITensorInfo *  weights,
const ITensorInfo *  biases,
const ITensorInfo *  dst,
FullyConnectedLayerInfo  fc_info = FullyConnectedLayerInfo(),
const WeightsInfo &  weights_info = WeightsInfo() 
)
static

Static function to check if given info will lead to a valid configuration of CpuFullyConnected.

Similar to CpuFullyConnected::configure()

Returns
a status

Definition at line 348 of file CpuFullyConnected.cpp.

350 {
351  ARM_COMPUTE_UNUSED(fc_info.retain_internal_weights);
352  ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(src, weights, dst);
354 
355  if (is_fixed_format_fast_math(weights_info.weight_format()))
356  {
357  ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_NOT_IN(src, DataType::F32);
358  ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_NOT_IN(weights, DataType::BFLOAT16);
360  }
361  else
362  {
363  ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(src, 1, DataType::F16, DataType::F32, DataType::QASYMM8, DataType::QASYMM8_SIGNED);
364  }
365 
366  ARM_COMPUTE_RETURN_ERROR_ON(weights->num_dimensions() > 2);
367  ARM_COMPUTE_RETURN_ERROR_ON(fc_info.activation_info.enabled() && is_data_type_quantized(src->data_type()) && fc_info.activation_info.activation() != ActivationLayerInfo::ActivationFunction::RELU
368  && fc_info.activation_info.activation() != ActivationLayerInfo::ActivationFunction::BOUNDED_RELU && fc_info.activation_info.activation() != ActivationLayerInfo::ActivationFunction::LU_BOUNDED_RELU);
369 
370  bool weights_reshaped = fc_info.transpose_weights ? fc_info.are_weights_reshaped : true;
371  bool is_fc_after_conv = true;
372 
373  const ITensorInfo &flatten_src = TensorInfo(src->clone()->set_is_resizable(true).reset_padding().set_tensor_shape(compute_flatten_shape(src)));
374  const ITensorInfo &reshaped_weights = TensorInfo(weights->clone()->set_is_resizable(true).reset_padding().set_tensor_shape(compute_transposed_shape(*weights)));
375  const ITensorInfo &converted_weights = weights_reshaped ? TensorInfo(weights->clone()->set_is_resizable(true).reset_padding()) : TensorInfo(*reshaped_weights.clone());
376 
377  // With the Fully Connected layer we can have 4 different cases:
378  // 1) Convolution layer -> Fully Connected layer without batches
379  // 2) Fully Connected layer -> Fully Connected layer without batches
380  // 3) Convolution layer -> Fully Connected layer with batches
381  // 4) Fully Connected layer -> Fully Connected layer with batches
382 
383  const ITensorInfo *src_to_use = src;
384  const ITensorInfo *weights_to_use = weights;
385 
386  // Check if we have a fully connected layer with batches
387  const bool is_batched_fc_layer = dst->dimension(1) > 1;
388 
389  if(biases != nullptr)
390  {
391  ARM_COMPUTE_RETURN_ERROR_ON(biases->num_dimensions() > 1);
392  if(is_data_type_quantized(src->data_type()))
393  {
394  ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_NOT_IN(biases, DataType::S32);
395  }
396  else
397  {
398  ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(src, biases);
399  }
400  }
401 
402  if(is_batched_fc_layer)
403  {
404  is_fc_after_conv = (TensorShape::num_max_dimensions >= 4) && (std::equal(src->tensor_shape().cbegin() + 3, src->tensor_shape().cend(), dst->tensor_shape().cbegin() + 1));
405  }
406  else
407  {
408  is_fc_after_conv = src->num_dimensions() > 1;
409  }
410 
411  if(!weights_reshaped)
412  {
413  // Validate reshape weights kernel
414  ARM_COMPUTE_RETURN_ON_ERROR(kernels::CpuTransposeKernel::validate(weights, &reshaped_weights));
415  weights_to_use = &reshaped_weights;
416  }
417 
418  if(is_fc_after_conv && (src->data_layout() != fc_info.weights_trained_layout))
419  {
420  // Validate convert weights kernel
421  ARM_COMPUTE_RETURN_ON_ERROR(CpuConvertFullyConnectedWeights::validate(weights_to_use,
422  &converted_weights,
423  src->tensor_shape(),
424  fc_info.weights_trained_layout));
425  weights_to_use = &converted_weights;
426  }
427 
428  if(is_fc_after_conv)
429  {
430  // Fully Connected layer after a Convolution Layer without batches
431  ARM_COMPUTE_RETURN_ERROR_ON((weights_to_use->dimension(1) != (src->dimension(0) * src->dimension(1) * src->dimension(2))));
432 
433  // Validate flatten kernel
434  ARM_COMPUTE_RETURN_ON_ERROR(CpuFlatten::validate(src, &flatten_src));
435  src_to_use = &flatten_src;
436  }
437  else
438  {
439  // Fully Connected layer after a Fully Connected Layer without batches
440  ARM_COMPUTE_RETURN_ERROR_ON(src->dimension(0) != weights_to_use->dimension(1));
441  }
442  // Validate matrix multiply kernel
443  ARM_COMPUTE_RETURN_ON_ERROR(validate_mm(src_to_use, weights_to_use, biases, dst, fc_info.activation_info, fc_info.enable_fast_math, weights_info.weight_format()));
444 
445  return Status{};
446 }

References ActivationLayerInfo::activation(), FullyConnectedLayerInfo::activation_info, FullyConnectedLayerInfo::are_weights_reshaped, ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, ARM_COMPUTE_UNUSED, arm_compute::BFLOAT16, ICloneable< T >::clone(), arm_compute::misc::shape_calculator::compute_flatten_shape(), arm_compute::misc::shape_calculator::compute_transposed_shape(), ITensorInfo::dimension(), arm_compute::test::validation::dst, FullyConnectedLayerInfo::enable_fast_math, ActivationLayerInfo::enabled(), arm_compute::F16, arm_compute::F32, arm_compute::is_data_type_quantized(), arm_compute::is_fixed_format_fast_math(), ITensorInfo::num_dimensions(), Dimensions< size_t >::num_max_dimensions, arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, FullyConnectedLayerInfo::retain_internal_weights, arm_compute::S32, arm_compute::test::validation::src, FullyConnectedLayerInfo::transpose_weights, CpuConvertFullyConnectedWeights::validate(), CpuTransposeKernel::validate(), CpuFlatten::validate(), arm_compute::test::validation::weights_info, and FullyConnectedLayerInfo::weights_trained_layout.

Referenced by CpuFullyConnected::configure(), and NEFullyConnectedLayer::validate().

◆ workspace()

experimental::MemoryRequirements workspace ( ) const
override virtual

Return the memory requirements required by the workspace.

Reimplemented from INEOperator.

Definition at line 542 of file CpuFullyConnected.cpp.

543 {
544  return _aux_mem;
545 }

The documentation for this class was generated from the following files:
  • CpuFullyConnected.h
  • CpuFullyConnected.cpp