Compute Library
 22.08
CpuFullyConnected Class Reference

Basic function to compute a Fully Connected layer. More...

#include <CpuFullyConnected.h>

Collaboration diagram for CpuFullyConnected:

Public Member Functions

 CpuFullyConnected ()
 Constructor. More...
 
 ~CpuFullyConnected ()
 Destructor. More...
 
void configure (const ITensorInfo *src, const ITensorInfo *weights, const ITensorInfo *biases, ITensorInfo *dst, FullyConnectedLayerInfo fc_info=FullyConnectedLayerInfo(), const WeightsInfo &weights_info=WeightsInfo())
 Set the input and output tensors. More...
 
void run (ITensorPack &tensors) override
 Run the kernels contained in the function. More...
 
void prepare (ITensorPack &tensors) override
 Prepare the function for executing. More...
 
experimental::MemoryRequirements workspace () const override
 Return the memory requirements required by the workspace. More...
 
- Public Member Functions inherited from INEOperator
 INEOperator (IRuntimeContext *ctx=nullptr)
 Constructor. More...
 
 INEOperator (const INEOperator &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 INEOperator (INEOperator &&)=default
 Default move constructor. More...
 
INEOperator & operator= (const INEOperator &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
INEOperator & operator= (INEOperator &&)=default
 Default move assignment operator. More...
 
 ~INEOperator ()
 Default destructor. More...
 
- Public Member Functions inherited from IOperator
virtual ~IOperator ()=default
 Destructor. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *src, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *dst, FullyConnectedLayerInfo fc_info=FullyConnectedLayerInfo())
 Static function to check if given info will lead to a valid configuration of CpuFullyConnected. More...
 
static Status has_opt_impl (arm_compute::WeightFormat &expected_weight_format, const ITensorInfo *src, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *dst, FullyConnectedLayerInfo fc_info, WeightsInfo weights_info)
 Static function that queries whether a fixed-format kernel exists and, if one does, returns in the first argument the format in which the weights are expected to be reshaped, as defined by the WeightFormat class. More...
 

Detailed Description

Basic function to compute a Fully Connected layer.

This function calls the following kernels:

  1. kernels::CpuIm2ColKernel (called when the input comes from a convolutional layer)
  2. kernels::CpuTransposeKernel (if are_weights_reshaped is set to false and transpose_weights is set to true) (called once)
  3. CpuGemm or CpuGemmLowpMatrixMultiplyCore (if quantized asymmetric)
  4. kernels::CpuGemmMatrixAdditionKernel or CpuGemmLowpOutputStage (if quantized asymmetric) (if biases is not equal to nullptr)
Note
The fully connected layer accepts "weights" tensors only with 2 dimensions.

Definition at line 54 of file CpuFullyConnected.h.
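
The choice between the two matrix-multiply backends in step 3 depends only on the input data type. A minimal sketch of that dispatch rule, using the kernel names from the list above (the string-based helper itself is hypothetical, not library code):

```cpp
#include <string>

// Dispatch rule from the kernel list above: asymmetric quantized inputs
// (QASYMM8 / QASYMM8_SIGNED) go through CpuGemmLowpMatrixMultiplyCore,
// while the other supported types (F16 / F32) go through CpuGemm.
std::string gemm_backend(const std::string &data_type)
{
    const bool is_quantized_asymmetric =
        data_type == "QASYMM8" || data_type == "QASYMM8_SIGNED";
    return is_quantized_asymmetric ? "CpuGemmLowpMatrixMultiplyCore" : "CpuGemm";
}
```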

Constructor & Destructor Documentation

◆ CpuFullyConnected()

Constructor.

Definition at line 148 of file CpuFullyConnected.cpp.

  : _flatten(nullptr),
    _convert_weights(nullptr),
    _transpose_weights(nullptr),
    _mm_gemm(nullptr),
    _mm_gemmlowp(nullptr),
    _flattened_src(),
    _converted_weights(),
    _reshaped_weights(),
    _trans_weights(),
    _trans_weights_idx(AuxTensorIdx::Count),
    _aux_mem(Count),
    _needs_weights_conversion(false),
    _needs_weights_reshape(false),
    _is_fc_after_conv(false),
    _is_quantized_asymmetric(false),
    _is_prepared(false),
    _enable_fast_math(false),
    _fixed_format(false),
    _weight_format(arm_compute::WeightFormat::UNSPECIFIED)
{
}

◆ ~CpuFullyConnected()

~CpuFullyConnected ( )
default

Destructor.

Member Function Documentation

◆ configure()

void configure ( const ITensorInfo *  src,
  const ITensorInfo *  weights,
  const ITensorInfo *  biases,
  ITensorInfo *  dst,
  FullyConnectedLayerInfo  fc_info = FullyConnectedLayerInfo(),
  const WeightsInfo &  weights_info = WeightsInfo()
)

Set the input and output tensors.

Valid data layouts:

  • NHWC
  • NCHW

Valid data type configurations:

src0            src1            src2  dst
F16             F16             F16   F16
F32             F32             F32   F32
QASYMM8         QASYMM8         S32   QASYMM8
QASYMM8_SIGNED  QASYMM8_SIGNED  S32   QASYMM8_SIGNED
Parameters
[in]  src           Source tensor info. Data type supported: QASYMM8/QASYMM8_SIGNED/F16/F32.
[in]  weights       Weights tensor info. The weights must be 2 dimensional. If this function is called after a Convolution Layer, the (transposed) weights will have as many rows as the product of the first 3 input's dimensions. If it is called after another FullyConnected Layer, the (transposed) weights will have as many rows as the input's first dimension. Data type supported: Same as src.
[in]  biases        Bias tensor info. Can be nullptr. Data type supported: Same as weights; S32 if weights is QASYMM8/QASYMM8_SIGNED.
[out] dst           Destination tensor info. Its shape should be equal to the output of a matrix multiplication between:
                      • The output of im2col on the input and the (transposed) 2D weights, if the function is called after a Convolution Layer
                      • The input tensor and the (transposed) 2D weights, if the function is called after another FullyConnected Layer. Data type supported: Same as src.
[in]  fc_info       (Optional) Fully connected layer additional info.
[in]  weights_info  (Optional) Stores necessary compute information when weights are already reshaped.
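
The row count described for weights above can be computed directly: after a convolution producing an output of shape (W, H, C, N), the (transposed) 2D weights have W*H*C rows. A hypothetical helper illustrating this (for exposition only, not a library function):

```cpp
#include <cstddef>
#include <utility>

// Expected (rows, cols) of the transposed 2-D weights when the FC layer
// follows a convolution with output shape (W, H, C, N):
// rows = W * H * C (the flattened input length), cols = number of units.
// Hypothetical helper for illustration only.
std::pair<std::size_t, std::size_t>
expected_weight_shape(std::size_t w, std::size_t h, std::size_t c, std::size_t num_units)
{
    return { w * h * c, num_units };
}
```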

Definition at line 234 of file CpuFullyConnected.cpp.

References FullyConnectedLayerInfo::activation_info, ITensorInfo::are_values_constant(), FullyConnectedLayerInfo::are_weights_reshaped, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, ARM_COMPUTE_LOG_PARAMS, Dimensions< T >::cbegin(), Dimensions< T >::cend(), ITensorInfo::data_layout(), ITensorInfo::data_type(), ITensorInfo::dimension(), FullyConnectedLayerInfo::enable_fast_math, arm_compute::is_data_type_quantized_asymmetric(), ITensorInfo::num_dimensions(), Dimensions< size_t >::num_max_dimensions, arm_compute::offset_int_vec(), arm_compute::experimental::Prepare, FullyConnectedLayerInfo::retain_internal_weights, ITensorInfo::tensor_shape(), TensorInfo::total_size(), FullyConnectedLayerInfo::transpose_weights, arm_compute::UNSPECIFIED, CpuFullyConnected::validate(), WeightsInfo::weight_format(), and FullyConnectedLayerInfo::weights_trained_layout.

{
    // Perform validate step
    ARM_COMPUTE_ERROR_ON_NULLPTR(src, weights, dst);
    ARM_COMPUTE_ERROR_THROW_ON(CpuFullyConnected::validate(src,
                                                           weights,
                                                           biases != nullptr ? biases : nullptr,
                                                           dst,
                                                           fc_info));
    ARM_COMPUTE_LOG_PARAMS(src, weights, biases, dst, fc_info);

    _needs_weights_conversion = false;
    _needs_weights_reshape    = fc_info.transpose_weights ? !fc_info.are_weights_reshaped : false;
    _needs_weights_reshape    = _needs_weights_reshape && !fc_info.retain_internal_weights;
    _is_fc_after_conv         = true;
    _is_quantized_asymmetric  = is_data_type_quantized_asymmetric(src->data_type());
    _is_prepared              = false;
    _trans_weights_idx        = AuxTensorIdx::Count;
    _enable_fast_math         = fc_info.enable_fast_math;
    _fixed_format             = weights_info.weight_format() != WeightFormat::UNSPECIFIED;
    _weight_format            = weights_info.weight_format();

    // With the Fully Connected layer we can have 4 different cases:
    //  1) Convolution layer -> Fully Connected layer without batches
    //  2) Fully Connected layer -> Fully Connected layer without batches
    //  3) Convolution layer -> Fully Connected layer with batches
    //  4) Fully Connected layer -> Fully Connected layer with batches

    const ITensorInfo *weights_to_use = weights;

    // Check if we have a fully connected layer with batches
    const bool is_batched_fc_layer = dst->dimension(1) > 1;
    if(is_batched_fc_layer)
    {
        _is_fc_after_conv = (TensorShape::num_max_dimensions >= 4) && (std::equal(src->tensor_shape().cbegin() + 3, src->tensor_shape().cend(), dst->tensor_shape().cbegin() + 1));
    }
    else
    {
        _is_fc_after_conv = src->num_dimensions() > 1;
    }

    // Reshape weights if needed
    if(_needs_weights_reshape)
    {
        // Reshape the weights
        _transpose_weights = std::make_unique<kernels::CpuTransposeKernel>();
        _transpose_weights->configure(weights, &_reshaped_weights);
        weights_to_use     = &_reshaped_weights;
        _trans_weights_idx = AuxTensorIdx::TransposedWeights;
    }

    // Convert weights if needed
    if(_is_fc_after_conv && (src->data_layout() != fc_info.weights_trained_layout))
    {
        // Convert weights
        _convert_weights = std::make_unique<CpuConvertFullyConnectedWeights>();
        _convert_weights->configure(weights_to_use,
                                    &_converted_weights,
                                    src->tensor_shape(),
                                    fc_info.weights_trained_layout);

        weights_to_use            = &_converted_weights;
        _needs_weights_conversion = true;
        _trans_weights_idx        = AuxTensorIdx::ConvertedWeights;
    }

    if(_is_fc_after_conv)
    {
        // Fully Connected layer after a Convolution Layer without batches
        configure_conv_fc(src, weights_to_use, biases, dst, fc_info.activation_info);
    }
    else
    {
        // Fully Connected layer after a Fully Connected Layer without batches
        configure_fc_fc(src, weights_to_use, biases, dst, fc_info.activation_info);
    }

    // Retain the tensorinfo with the weights to use
    if(_needs_weights_reshape || _needs_weights_conversion)
    {
        _trans_weights = *weights_to_use;
    }

    // Set auxiliary memory requirements
    auto gemm_mem_req = (_is_quantized_asymmetric) ? _mm_gemmlowp->workspace() : _mm_gemm->workspace();
    for(unsigned int i = 0; i < gemm_mem_req.size(); ++i)
    {
        _aux_mem[i] = gemm_mem_req[i];
    }

    if(_aux_mem[Pretranspose].size > 0)
    {
        // Release permuted weights at the end of prepare as they are further transposed by the assembly dispatch
        // Do not release them if biases are dynamic and data type is quantized, since the weights tensor will be used for biases offset calculation
        _aux_mem[TransposedWeights] = MemoryInfo(offset_int_vec(TransposedWeights), (_is_quantized_asymmetric && biases && !(biases->are_values_constant())) ? MemoryLifetime::Persistent : MemoryLifetime::Prepare,
                                                 _reshaped_weights.total_size());
        _aux_mem[ConvertedWeights]  = MemoryInfo(offset_int_vec(ConvertedWeights), MemoryLifetime::Prepare, _converted_weights.total_size());
    }
    else
    {
        _aux_mem[TransposedWeights] = MemoryInfo(offset_int_vec(TransposedWeights), _needs_weights_conversion ? MemoryLifetime::Prepare : MemoryLifetime::Persistent, _reshaped_weights.total_size());
        _aux_mem[ConvertedWeights]  = MemoryInfo(offset_int_vec(ConvertedWeights), MemoryLifetime::Persistent, _converted_weights.total_size());
    }
    _aux_mem[FlattenedSrc] = MemoryInfo(offset_int_vec(FlattenedSrc), MemoryLifetime::Temporary, _flattened_src.total_size());
}
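
The four cases enumerated in the comments above reduce to a shape comparison. A standalone model of that decision, with shapes as dimension vectors ordered fastest-varying first as in arm_compute::TensorShape (an illustration of the logic, not the library code itself):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Decide whether the fully connected layer follows a convolution,
// mirroring configure(): in the batched case the input's dimensions
// from index 3 onwards must equal the output's dimensions from index 1
// onwards (the batch dimensions); in the unbatched case any
// multi-dimensional input is treated as a convolution output.
bool is_fc_after_conv(const std::vector<std::size_t> &src_shape,
                      const std::vector<std::size_t> &dst_shape)
{
    const bool is_batched_fc_layer = dst_shape.size() > 1 && dst_shape[1] > 1;
    if(is_batched_fc_layer)
    {
        if(src_shape.size() < 4)
        {
            return false; // no room for (W, H, C) plus batch dimensions
        }
        const std::size_t batch_dims = src_shape.size() - 3;
        if(dst_shape.size() < batch_dims + 1)
        {
            return false; // batch dimensions cannot match
        }
        return std::equal(src_shape.begin() + 3, src_shape.end(),
                          dst_shape.begin() + 1);
    }
    return src_shape.size() > 1;
}
```

For example, a conv output of shape (7, 7, 64, 4) feeding an FC output of shape (1000, 4) matches case 3, while (4096, 4) -> (1000, 4) matches case 4.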

◆ has_opt_impl()

Status has_opt_impl ( arm_compute::WeightFormat &  expected_weight_format,
  const ITensorInfo *  src,
  const ITensorInfo *  weights,
  const ITensorInfo *  biases,
  const ITensorInfo *  dst,
  FullyConnectedLayerInfo  fc_info,
  WeightsInfo  weights_info
)
static

Static function that queries whether a fixed-format kernel exists and, if one does, returns in the first argument the format in which the weights are expected to be reshaped, as defined by the WeightFormat class.

Apart from the first argument, the arguments are the same as in CpuFullyConnected::validate(), except that here all arguments are required.

Returns
a status

Definition at line 342 of file CpuFullyConnected.cpp.

References FullyConnectedLayerInfo::activation_info, FullyConnectedLayerInfo::enable_fast_math, arm_compute::test::validation::gemm_info, CpuGemm::has_opt_impl(), GEMMInfo::set_activation_info(), GEMMInfo::set_fast_math(), GEMMInfo::set_fixed_format(), GEMMInfo::set_weight_format(), arm_compute::UNSPECIFIED, and WeightsInfo::weight_format().

Referenced by NEFullyConnectedLayer::has_opt_impl().

{
    GEMMInfo gemm_info(false, false, true /* Reshape weights only for the first run */);
    gemm_info.set_activation_info(fc_info.activation_info);
    gemm_info.set_fast_math(fc_info.enable_fast_math);
    gemm_info.set_fixed_format(weights_info.weight_format() != WeightFormat::UNSPECIFIED);
    gemm_info.set_weight_format(weights_info.weight_format());

    return CpuGemm::has_opt_impl(expected_weight_format, src, weights, biases, dst, gemm_info);
}

◆ prepare()

void prepare ( ITensorPack &  constants )
overridevirtual

Prepare the function for executing.

Any one-off pre-processing steps required by the function are handled here.

Parameters
[in]  constants  Vector that contains the constant tensors.
Note
Prepare stage might not need all the function's buffers' backing memory to be available in order to execute

Reimplemented from INEOperator.

Definition at line 478 of file CpuFullyConnected.cpp.

References arm_compute::ACL_DST, arm_compute::ACL_SRC, arm_compute::ACL_SRC_1, ITensorPack::add_const_tensor(), Window::DimY, Scheduler::get(), CpuAuxTensorHandler::get(), ITensorPack::get_const_tensor(), ITensor::mark_as_unused(), arm_compute::offset_int_vec(), and IScheduler::schedule_op().

Referenced by CpuFullyConnected::run().

{
    if(!_is_prepared)
    {
        auto weights = tensors.get_const_tensor(ACL_SRC_1);

        CpuAuxTensorHandler reshaped_weights(offset_int_vec(TransposedWeights), _reshaped_weights, tensors, false);
        CpuAuxTensorHandler converted_weights(offset_int_vec(ConvertedWeights), _converted_weights, tensors, false);

        // Pointer to current weights
        const ITensor *cur_weights = weights;

        // Reshape of the weights (happens only once)
        if(_needs_weights_reshape)
        {
            // Run reshape weights kernel and mark weights as unused
            ITensorPack transpose_pack{ { ACL_SRC, weights }, { ACL_DST, reshaped_weights.get() } };
            NEScheduler::get().schedule_op(_transpose_weights.get(), Window::DimY, _transpose_weights->window(), transpose_pack);

            cur_weights->mark_as_unused();
            cur_weights = reshaped_weights.get();
        }

        // Convert weights if needed (happens only once)
        if(_needs_weights_conversion)
        {
            ITensorPack convert_pack{ { ACL_SRC, cur_weights }, { ACL_DST, converted_weights.get() } };
            _convert_weights->run(convert_pack);

            cur_weights->mark_as_unused();
            cur_weights = converted_weights.get();
        }

        ITensorPack gemm_pack = tensors;
        gemm_pack.add_const_tensor(ACL_SRC_1, cur_weights);

        // Run GEMM prepare and release unused weights
        if(!_is_quantized_asymmetric)
        {
            _mm_gemm->prepare(gemm_pack);
        }
        else
        {
            _mm_gemmlowp->prepare(gemm_pack);
        }

        _is_prepared = true;
    }
}
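
prepare() follows a run-once pattern: the expensive weight transformations are guarded by the _is_prepared flag, so repeated run() calls pay the cost only once. The idiom, reduced to a self-contained sketch (the counter stands in for the transpose/convert work; this is not library code):

```cpp
// Run-once preparation idiom used by prepare()/run(): a guard flag
// ensures the one-off work executes a single time, no matter how many
// times run() is called afterwards.
struct PrepareOnceOperator
{
    bool prepared      = false;
    int  prepare_count = 0;

    void prepare()
    {
        if(!prepared)
        {
            ++prepare_count; // one-off weight transformation would go here
            prepared = true;
        }
    }

    void run()
    {
        prepare(); // run() always ensures preparation happened first
    }
};
```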

◆ run()

void run ( ITensorPack &  tensors )
overridevirtual

Run the kernels contained in the function.

Parameters
[in]  tensors  Vector that contains the tensors to operate on.

Reimplemented from INEOperator.

Definition at line 444 of file CpuFullyConnected.cpp.

References arm_compute::ACL_DST, arm_compute::ACL_SRC, arm_compute::ACL_SRC_0, arm_compute::ACL_SRC_1, ITensorPack::add_const_tensor(), CpuAuxTensorHandler::get(), ITensorPack::get_const_tensor(), arm_compute::offset_int_vec(), CpuFullyConnected::prepare(), and arm_compute::test::validation::src.

{
    prepare(tensors);

    auto src = tensors.get_const_tensor(ACL_SRC_0);

    CpuAuxTensorHandler flattened_src(offset_int_vec(FlattenedSrc), _flattened_src, tensors, false);
    CpuAuxTensorHandler transformed_wei(offset_int_vec(_trans_weights_idx), _trans_weights, tensors, false);

    // Linearize src if it comes from a convolutional layer
    if(_is_fc_after_conv)
    {
        ITensorPack flatten_pack{ { ACL_SRC, src }, { ACL_DST, flattened_src.get() } };
        _flatten->run(flatten_pack);
    }

    ITensorPack gemm_pack = tensors;
    gemm_pack.add_const_tensor(ACL_SRC_0, (_is_fc_after_conv) ? flattened_src.get() : src);
    if(_needs_weights_reshape || _needs_weights_conversion)
    {
        gemm_pack.add_const_tensor(ACL_SRC_1, transformed_wei.get());
    }

    // Run matrix multiply
    if(_is_quantized_asymmetric)
    {
        _mm_gemmlowp->run(gemm_pack);
    }
    else
    {
        _mm_gemm->run(gemm_pack);
    }
}
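
The flattening step above turns a convolutional output into a GEMM input. A model of the resulting shape (the first three dimensions collapse into one and batch dimensions pass through; this illustrates the idea behind the library's compute_flatten_shape, it is not that function):

```cpp
#include <cstddef>
#include <vector>

// Collapse the first three dimensions (W, H, C) of a shape into one,
// keeping any remaining batch dimensions: (W, H, C, N) -> (W*H*C, N).
std::vector<std::size_t> flatten_shape(const std::vector<std::size_t> &shape)
{
    std::size_t flattened = 1;
    std::vector<std::size_t> out;
    for(std::size_t i = 0; i < shape.size(); ++i)
    {
        if(i < 3)
        {
            flattened *= shape[i]; // spatial and channel dims collapse
        }
        else
        {
            out.push_back(shape[i]); // batch dimensions pass through
        }
    }
    out.insert(out.begin(), flattened);
    return out;
}
```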

◆ validate()

Status validate ( const ITensorInfo *  src,
  const ITensorInfo *  weights,
  const ITensorInfo *  biases,
  const ITensorInfo *  dst,
  FullyConnectedLayerInfo  fc_info = FullyConnectedLayerInfo()
)
static

Static function to check if given info will lead to a valid configuration of CpuFullyConnected.

Similar to CpuFullyConnected::configure()

Returns
a status

Definition at line 354 of file CpuFullyConnected.cpp.

References ActivationLayerInfo::activation(), FullyConnectedLayerInfo::activation_info, ITensorInfo::are_values_constant(), FullyConnectedLayerInfo::are_weights_reshaped, ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, ARM_COMPUTE_UNUSED, ActivationLayerInfo::BOUNDED_RELU, Dimensions< T >::cbegin(), Dimensions< T >::cend(), ICloneable< T >::clone(), arm_compute::misc::shape_calculator::compute_flatten_shape(), arm_compute::misc::shape_calculator::compute_transposed_shape(), ITensorInfo::data_layout(), ITensorInfo::data_type(), ITensorInfo::dimension(), FullyConnectedLayerInfo::enable_fast_math, ActivationLayerInfo::enabled(), arm_compute::F16, arm_compute::F32, arm_compute::is_data_type_quantized(), ActivationLayerInfo::LU_BOUNDED_RELU, ITensorInfo::num_dimensions(), Dimensions< size_t >::num_max_dimensions, arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, ActivationLayerInfo::RELU, FullyConnectedLayerInfo::retain_internal_weights, arm_compute::S32, arm_compute::test::validation::src, ITensorInfo::tensor_shape(), FullyConnectedLayerInfo::transpose_weights, CpuConvertFullyConnectedWeights::validate(), CpuTransposeKernel::validate(), CpuFlatten::validate(), and FullyConnectedLayerInfo::weights_trained_layout.

Referenced by CpuFullyConnected::configure(), and NEFullyConnectedLayer::validate().

{
    ARM_COMPUTE_UNUSED(fc_info.retain_internal_weights);
    ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(src, weights, dst);
    ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(src, 1, DataType::QASYMM8, DataType::QASYMM8_SIGNED, DataType::F16, DataType::F32);
    ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(src, weights, dst);
    ARM_COMPUTE_RETURN_ERROR_ON(weights->num_dimensions() > 2);
    ARM_COMPUTE_RETURN_ERROR_ON(fc_info.activation_info.enabled() && is_data_type_quantized(src->data_type()) && fc_info.activation_info.activation() != ActivationLayerInfo::ActivationFunction::RELU
                                && fc_info.activation_info.activation() != ActivationLayerInfo::ActivationFunction::BOUNDED_RELU && fc_info.activation_info.activation() != ActivationLayerInfo::ActivationFunction::LU_BOUNDED_RELU);
    ARM_COMPUTE_RETURN_ERROR_ON(!weights->are_values_constant() && (!fc_info.are_weights_reshaped || fc_info.transpose_weights));

    bool weights_reshaped = fc_info.transpose_weights ? fc_info.are_weights_reshaped : true;
    bool is_fc_after_conv = true;

    const ITensorInfo &flatten_src       = TensorInfo(src->clone()->set_is_resizable(true).reset_padding().set_tensor_shape(compute_flatten_shape(src)));
    const ITensorInfo &reshaped_weights  = TensorInfo(weights->clone()->set_is_resizable(true).reset_padding().set_tensor_shape(compute_transposed_shape(*weights)));
    const ITensorInfo &converted_weights = weights_reshaped ? TensorInfo(weights->clone()->set_is_resizable(true).reset_padding()) : TensorInfo(*reshaped_weights.clone());

    // With the Fully Connected layer we can have 4 different cases:
    //  1) Convolution layer -> Fully Connected layer without batches
    //  2) Fully Connected layer -> Fully Connected layer without batches
    //  3) Convolution layer -> Fully Connected layer with batches
    //  4) Fully Connected layer -> Fully Connected layer with batches

    const ITensorInfo *src_to_use     = src;
    const ITensorInfo *weights_to_use = weights;

    // Check if we have a fully connected layer with batches
    const bool is_batched_fc_layer = dst->dimension(1) > 1;

    if(biases != nullptr)
    {
        ARM_COMPUTE_RETURN_ERROR_ON(biases->num_dimensions() > 1);
        if(is_data_type_quantized(src->data_type()))
        {
            ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(biases, 1, DataType::S32);
        }
        else
        {
            ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(src, biases);
        }
    }

    if(is_batched_fc_layer)
    {
        is_fc_after_conv = (TensorShape::num_max_dimensions >= 4) && (std::equal(src->tensor_shape().cbegin() + 3, src->tensor_shape().cend(), dst->tensor_shape().cbegin() + 1));
    }
    else
    {
        is_fc_after_conv = src->num_dimensions() > 1;
    }

    if(!weights_reshaped)
    {
        // Validate reshape weights kernel
        ARM_COMPUTE_RETURN_ON_ERROR(kernels::CpuTransposeKernel::validate(weights, &reshaped_weights));
        weights_to_use = &reshaped_weights;
    }

    if(is_fc_after_conv && (src->data_layout() != fc_info.weights_trained_layout))
    {
        // Validate convert weights kernel
        ARM_COMPUTE_RETURN_ON_ERROR(CpuConvertFullyConnectedWeights::validate(weights_to_use,
                                                                              &converted_weights,
                                                                              src->tensor_shape(),
                                                                              fc_info.weights_trained_layout));
        weights_to_use = &converted_weights;
    }

    if(is_fc_after_conv)
    {
        // Fully Connected layer after a Convolution Layer without batches
        ARM_COMPUTE_RETURN_ERROR_ON((weights_to_use->dimension(1) != (src->dimension(0) * src->dimension(1) * src->dimension(2))));

        // Validate flatten kernel
        ARM_COMPUTE_RETURN_ON_ERROR(CpuFlatten::validate(src, &flatten_src));
        src_to_use = &flatten_src;
    }
    else
    {
        // Fully Connected layer after a Fully Connected Layer without batches
        ARM_COMPUTE_RETURN_ERROR_ON(src->dimension(0) != weights_to_use->dimension(1));
    }
    // Validate matrix multiply kernel
    ARM_COMPUTE_RETURN_ON_ERROR(validate_mm(src_to_use, weights_to_use, biases, dst, fc_info.activation_info, fc_info.enable_fast_math));

    return Status{};
}
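
The bias checks in validate() enforce the data-type table from configure(): quantized inputs require S32 biases, while float inputs require biases of the same type as the input. A minimal sketch of that rule (the string-based helper is illustrative, not a library function):

```cpp
#include <string>

// Bias data-type rule enforced by validate(): S32 for asymmetric
// quantized inputs (QASYMM8 / QASYMM8_SIGNED), same type as the input
// otherwise (F16 -> F16, F32 -> F32).
std::string required_bias_type(const std::string &src_type)
{
    if(src_type == "QASYMM8" || src_type == "QASYMM8_SIGNED")
    {
        return "S32";
    }
    return src_type;
}
```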

◆ workspace()

experimental::MemoryRequirements workspace ( ) const
overridevirtual

Return the memory requirements required by the workspace.

Reimplemented from INEOperator.

Definition at line 528 of file CpuFullyConnected.cpp.

{
    return _aux_mem;
}

The documentation for this class was generated from the following files:
CpuFullyConnected.h
CpuFullyConnected.cpp