Compute Library 21.08
ClFullyConnected Class Reference

Basic function to compute a Fully Connected layer on OpenCL. More...

#include <ClFullyConnected.h>


Public Member Functions

 ClFullyConnected ()
 
 ~ClFullyConnected ()
 
void configure (const CLCompileContext &compile_context, ITensorInfo *src, ITensorInfo *weights, ITensorInfo *biases, ITensorInfo *dst, FullyConnectedLayerInfo fc_info=FullyConnectedLayerInfo())
 Set the input and output tensors. More...
 
void run (ITensorPack &tensors) override
 Run the kernels contained in the function. More...
 
void prepare (ITensorPack &tensors) override
 Prepare the function for executing. More...
 
experimental::MemoryRequirements workspace () const override
 Return the memory requirements required by the workspace. More...
 
- Public Member Functions inherited from ICLOperator
 ICLOperator (IRuntimeContext *ctx=nullptr)
 Constructor. More...
 
 ICLOperator (const ICLOperator &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 ICLOperator (ICLOperator &&)=default
 Default move constructor. More...
 
ICLOperator & operator= (const ICLOperator &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
ICLOperator & operator= (ICLOperator &&)=default
 Default move assignment operator. More...
 
- Public Member Functions inherited from IOperator
virtual ~IOperator ()=default
 Destructor. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *src, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *dst, FullyConnectedLayerInfo fc_info=FullyConnectedLayerInfo())
 Static function to check if given info will lead to a valid configuration. More...
 

Detailed Description

Basic function to compute a Fully Connected layer on OpenCL.

This function calls the following OpenCL kernels:

  1. opencl::kernels::ClIm2ColKernel (called when the input comes from a convolutional layer)
  2. CLTranspose (if are_weights_reshaped is set to false and transpose_weights is set to true) (called once)
  3. opencl::kernels::ClGemmMatrixMultiplyKernel or CLGEMMLowpMatrixMultiplyCore (if quantized asymmetric)
Note
The fully connected layer accepts "weights" tensors only with 2 dimensions.

Definition at line 53 of file ClFullyConnected.h.
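
The snippet below is a minimal usage sketch at the operator level. It is not taken from the library's documentation: the shapes, the FullyConnectedLayerInfo defaults noted in the comments and the surrounding setup are illustrative assumptions. The caller describes all tensors with ITensorInfo, asks validate() whether the configuration is supported, and then calls configure() once with a CLCompileContext:

  // Sketch only; headers and error handling trimmed, shapes invented for the example.
  using namespace arm_compute;

  CLScheduler::get().default_init(); // create a default OpenCL context and queue

  // Fully connected layer fed by another FC layer: 256 inputs -> 10 outputs, F32.
  TensorInfo src_info(TensorShape(256U), 1, DataType::F32);
  TensorInfo wei_info(TensorShape(256U, 10U), 1, DataType::F32); // untransposed weights (transpose_weights = true)
  TensorInfo bia_info(TensorShape(10U), 1, DataType::F32);
  TensorInfo dst_info(TensorShape(10U), 1, DataType::F32);

  FullyConnectedLayerInfo fc_info{}; // assumed defaults: transpose_weights = true, are_weights_reshaped = false

  opencl::ClFullyConnected fc;
  ARM_COMPUTE_ERROR_THROW_ON(opencl::ClFullyConnected::validate(&src_info, &wei_info, &bia_info, &dst_info, fc_info));
  fc.configure(CLKernelLibrary::get().get_compile_context(), &src_info, &wei_info, &bia_info, &dst_info, fc_info);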

Constructor & Destructor Documentation

◆ ClFullyConnected()

ClFullyConnected ( )

Definition at line 144 of file ClFullyConnected.cpp.

144 ClFullyConnected::ClFullyConnected()
145  : _convert_weights(nullptr),
146  _flatten(nullptr),
147  _reshape_weights(nullptr),
148  _mm_gemm(nullptr),
149  _mm_gemmlowp(nullptr),
150  _aux_mem(Count)
151 {
152 }

◆ ~ClFullyConnected()

~ClFullyConnected ( )
default

Member Function Documentation

◆ configure()

void configure ( const CLCompileContext & compile_context,
ITensorInfo * src,
ITensorInfo * weights,
ITensorInfo * biases,
ITensorInfo * dst,
FullyConnectedLayerInfo  fc_info = FullyConnectedLayerInfo() 
)

Set the input and output tensors.

Valid data layouts:

  • NHWC
  • NCHW

Valid data type configurations:

src0            src1            src2  dst
F16             F16             F16   F16
F32             F32             F32   F32
QASYMM8         QASYMM8         S32   QASYMM8
QASYMM8_SIGNED  QASYMM8_SIGNED  S32   QASYMM8_SIGNED
Parameters
  [in]  compile_context  The compile context to be used.
  [in]  src              Source tensor. Data type supported: QASYMM8/QASYMM8_SIGNED/F16/F32.
  [in]  weights          Weights tensor. The weights must be 2 dimensional. If this function is called after a Convolution Layer, the (transposed) weights will have as many rows as the product of the first 3 input's dimensions. If it is called after another FullyConnected Layer, the (transposed) weights will have as many rows as the input's first dimension. Data type supported: Same as src.
  [in]  biases           Bias tensor. Can be nullptr. Data type supported: Same as src.
  [out] dst              Destination tensor. Its shape should be equal to the output of a matrix multiplication between:
                           • The output of im2col on the input and the (transposed) 2D weights, if the function is called after a Convolution Layer
                           • The input tensor and the (transposed) 2D weights, if the function is called after another FullyConnected Layer. Data type supported: Same as src.
  [in]  fc_info          (Optional) Fully connected layer additional info
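
To make the weights-shape requirement above concrete, here is a small worked example with invented shapes (not from the library's documentation): after a convolution whose output is 7x7x64, the flattened input has 7*7*64 = 3136 features, so the transposed 2D weights must have 3136 rows; with 4096 outputs and the default transpose_weights = true, the weights tensor a caller would pass is therefore 3136 x 4096:

  // Illustrative shapes only.
  using namespace arm_compute;

  TensorInfo conv_out(TensorShape(7U, 7U, 64U), 1, DataType::F32); // W x H x C coming from a conv layer
  const unsigned int num_inputs  = 7U * 7U * 64U;                  // 3136 features once flattened
  const unsigned int num_outputs = 4096U;

  // Untransposed weights laid out as (num_inputs, num_outputs); the function transposes
  // them internally, so the transposed weights end up with 3136 rows, matching the check in validate():
  //   weights_to_use->dimension(1) == src->dimension(0) * src->dimension(1) * src->dimension(2)
  TensorInfo wei_info(TensorShape(num_inputs, num_outputs), 1, DataType::F32);
  TensorInfo dst_info(TensorShape(num_outputs), 1, DataType::F32);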

Definition at line 227 of file ClFullyConnected.cpp.

References arm_compute::ACL_SRC_1, FullyConnectedLayerInfo::are_weights_reshaped, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, Dimensions< T >::cbegin(), Dimensions< T >::cend(), ITensorInfo::data_layout(), ITensorInfo::data_type(), ITensorInfo::dimension(), arm_compute::is_data_type_quantized_asymmetric(), ITensorInfo::num_dimensions(), Dimensions< size_t >::num_max_dimensions, arm_compute::offset_int_vec(), FullyConnectedLayerInfo::retain_internal_weights, ITensorInfo::tensor_shape(), TensorInfo::total_size(), FullyConnectedLayerInfo::transpose_weights, ClFullyConnected::validate(), and FullyConnectedLayerInfo::weights_trained_layout.

229 {
230  ARM_COMPUTE_ERROR_ON_NULLPTR(src, weights, dst);
231 
232  // Perform validate step
233  ARM_COMPUTE_ERROR_THROW_ON(ClFullyConnected::validate(src, weights, biases, dst, fc_info));
234 
235  _are_weights_converted = true;
236  _are_weights_reshaped = fc_info.transpose_weights ? fc_info.are_weights_reshaped : true;
237  _is_fc_after_conv = true;
238  _is_quantized = is_data_type_quantized_asymmetric(src->data_type());
239  _is_prepared = fc_info.retain_internal_weights;
240  _weights_to_use = TensorInfo(*weights);
241  _weights_to_use_idx = ACL_SRC_1;
242 
243  // With the Fully Connected layer we can have 4 different cases:
244  // 1) Convolution layer -> Fully Connected layer without batches
245  // 2) Fully Connected layer -> Fully Connected layer without batches
246  // 3) Convolution layer -> Fully Connected layer with batches
247  // 4) Fully Connected layer -> Fully Connected layer with batches
248 
249  // Check if we have a fully connected layer with batches
250  const bool is_batched_fc_layer = dst->dimension(1) > 1;
251  if(is_batched_fc_layer)
252  {
253  _is_fc_after_conv = (TensorShape::num_max_dimensions >= 4) && (std::equal(src->tensor_shape().cbegin() + 3,
254  src->tensor_shape().cend(),
255  dst->tensor_shape().cbegin() + 1));
256  }
257  else
258  {
259  _is_fc_after_conv = src->num_dimensions() > 1;
260  }
261 
262  ITensorInfo *weights_used = weights;
263 
264  // Reshape weights if needed
265  if(!_are_weights_reshaped)
266  {
267  // Reshape the weights
268  _reshape_weights = std::make_unique<ClTranspose>();
269  _reshape_weights->configure(compile_context, weights, &_reshaped_weights);
270  weights_used = &_reshaped_weights;
271  _weights_to_use_idx = offset_int_vec(TransposedWeights);
272  }
273 
274  // Convert weights if needed
275  if(_is_fc_after_conv && (src->data_layout() != fc_info.weights_trained_layout))
276  {
277  // Convert weights
278  _convert_weights = std::make_unique<ClConvertFullyConnectedWeights>();
279  _convert_weights->configure(compile_context,
280  weights_used,
281  &_converted_weights,
282  src->tensor_shape(),
283  fc_info.weights_trained_layout);
284 
285  weights_used = &_converted_weights;
286  _weights_to_use_idx = offset_int_vec(ConvertedWeights);
287  _are_weights_converted = false;
288  }
289 
290  if(_is_fc_after_conv)
291  {
292  // Fully Connected layer after a Convolution Layer without batches
293  configure_conv_fc(compile_context, src, weights_used, biases, dst, fc_info);
294  }
295  else
296  {
297  // Fully Connected layer after a Fully Connected Layer without batches
298  configure_fc_fc(compile_context, src, weights_used, biases, dst, fc_info);
299  }
300  // Update TensorInfo of final weights used (Need to be done in the end due to padding expansion)
301  _weights_to_use = *weights_used;
302 
303  // Set auxiliary memory requirements
304  auto gemm_mem_req = (_is_quantized) ? _mm_gemmlowp->workspace() : _mm_gemm->workspace();
305  for(unsigned int i = 0; i < gemm_mem_req.size(); ++i)
306  {
307  _aux_mem[i] = gemm_mem_req[i];
308  }
309  if(_aux_mem[1].size > 0 || _aux_mem[2].size > 0) // Persistent weights memory on GEMMs
310  {
311  // Release permuted weights at the end of prepare as they are further transposed by the assembly dispatch
312  _aux_mem[TransposedWeights] = MemoryInfo(offset_int_vec(TransposedWeights), MemoryLifetime::Prepare, _reshaped_weights.total_size());
313  _aux_mem[ConvertedWeights] = MemoryInfo(offset_int_vec(ConvertedWeights), MemoryLifetime::Prepare, _converted_weights.total_size());
314  }
315  else
316  {
317  // Release permuted weights at the end of prepare as they are further transposed by the assembly dispatch
318  const auto transposed_wei_lft = (_weights_to_use_idx == offset_int_vec(TransposedWeights)) ? MemoryLifetime::Persistent : MemoryLifetime::Prepare;
319  const auto converted_wei_lft = (_weights_to_use_idx == offset_int_vec(ConvertedWeights)) ? MemoryLifetime::Persistent : MemoryLifetime::Prepare;
320 
321  _aux_mem[TransposedWeights] = MemoryInfo(offset_int_vec(TransposedWeights), transposed_wei_lft, _reshaped_weights.total_size());
322  _aux_mem[ConvertedWeights] = MemoryInfo(offset_int_vec(ConvertedWeights), converted_wei_lft, _converted_weights.total_size());
323  }
324  _aux_mem[FlattenedSrc] = MemoryInfo(offset_int_vec(FlattenedSrc), MemoryLifetime::Temporary, _flattened_src.total_size());
325 }
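
The batched-layer check near the top of the listing above can be read as follows; this is a worked sketch with invented shapes, not library documentation:

  // Invented shapes to illustrate the is_batched_fc_layer / _is_fc_after_conv decision.
  // Needs <algorithm> for std::equal.
  using namespace arm_compute;

  TensorShape src_shape(7U, 7U, 64U, 2U); // convolution output for a batch of 2
  TensorShape dst_shape(4096U, 2U);       // FC output: 4096 features x 2 batches

  const bool is_batched = dst_shape[1] > 1; // true: dimension 1 of dst is the batch size

  // Batched case: the layer is treated as following a convolution when the dimensions of
  // src from index 3 onwards equal the dimensions of dst from index 1 onwards, i.e. the
  // batch dimensions line up ({2, 1, 1} == {2, 1, 1} here), so the input will be flattened.
  const bool fc_after_conv = (TensorShape::num_max_dimensions >= 4)
                             && std::equal(src_shape.cbegin() + 3, src_shape.cend(), dst_shape.cbegin() + 1);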

◆ prepare()

void prepare ( ITensorPack & constants )
override virtual

Prepare the function for executing.

Any one off pre-processing step required by the function is handled here

Parameters
  [in]  constants  Vector that contains the constant tensors.
Note
Prepare stage might not need all the function's buffers' backing memory to be available in order to execute

Reimplemented from ICLOperator.

Definition at line 439 of file ClFullyConnected.cpp.

References arm_compute::ACL_DST, arm_compute::ACL_SRC, arm_compute::ACL_SRC_1, ITensorPack::add_const_tensor(), CLAuxTensorHandler::get(), ITensorPack::get_const_tensor(), ITensor::mark_as_unused(), and arm_compute::offset_int_vec().

Referenced by ClFullyConnected::run().

440 {
441  if(!_is_prepared)
442  {
443  auto weights = tensors.get_const_tensor(ACL_SRC_1);
444 
445  CLAuxTensorHandler reshaped_weights(offset_int_vec(TransposedWeights), _reshaped_weights, tensors, false);
446  CLAuxTensorHandler converted_weights(offset_int_vec(ConvertedWeights), _converted_weights, tensors, false);
447 
448  // Pointer to current weights
449  const ITensor *cur_weights = weights;
450 
451  // Reshape of the weights if needed (happens only once)
452  if(!_are_weights_reshaped)
453  {
454  // Run reshape weights kernel and mark weights as unused
455  ITensorPack transpose_pack{ { ACL_SRC, weights }, { ACL_DST, reshaped_weights.get() } };
456  _reshape_weights->run(transpose_pack);
457 
458  cur_weights->mark_as_unused();
459  cur_weights = reshaped_weights.get();
460 
461  _are_weights_reshaped = true;
462  }
463 
464  // Convert weights if needed (happens only once)
465  if(!_are_weights_converted)
466  {
467  ITensorPack convert_pack{ { ACL_SRC, cur_weights }, { ACL_DST, converted_weights.get() } };
468  _convert_weights->run(convert_pack);
469 
470  cur_weights->mark_as_unused();
471  cur_weights = converted_weights.get();
472 
473  _are_weights_converted = true;
474  }
475 
476  tensors.add_const_tensor(ACL_SRC_1, cur_weights);
477 
478  // Run the GEMM prepare step and release unused weights
479  if(!_is_quantized)
480  {
481  _mm_gemm->prepare(tensors);
482  }
483  else
484  {
485  _mm_gemmlowp->prepare(tensors);
486  }
487  _is_prepared = true;
488  }
489 }
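
As a usage sketch (continuing the invented setup from the earlier examples, so fc and the CLTensor objects are assumptions rather than documented code), prepare() can be called explicitly once the trained weights are available, so that the one-off weight reshape/convert and the GEMM preparation do not run inside a latency-critical loop; run() would otherwise trigger it on its first call:

  // Sketch: src, weights, biases and dst are CLTensors allocated with the shapes passed to
  // configure(); auxiliary workspace tensors are omitted here (see workspace() below).
  ITensorPack pack;
  pack.add_tensor(ACL_SRC_0, &src);
  pack.add_const_tensor(ACL_SRC_1, &weights);
  pack.add_const_tensor(ACL_SRC_2, &biases);
  pack.add_tensor(ACL_DST, &dst);

  fc.prepare(pack); // one-off: transpose/convert the weights and prepare the GEMM

  for(int i = 0; i < 100; ++i)
  {
      fc.run(pack); // steady state: flatten (if needed) + matrix multiply only
  }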

◆ run()

void run ( ITensorPack & tensors )
override virtual

Run the kernels contained in the function.

Parameters
  [in]  tensors  Vector that contains the tensors to operate on.

Reimplemented from ICLOperator.

Definition at line 405 of file ClFullyConnected.cpp.

References arm_compute::ACL_DST, arm_compute::ACL_SRC, arm_compute::ACL_SRC_0, arm_compute::ACL_SRC_1, ITensorPack::add_const_tensor(), CLAuxTensorHandler::get(), ITensorPack::get_const_tensor(), arm_compute::offset_int_vec(), ClFullyConnected::prepare(), and arm_compute::test::validation::src.

406 {
407  prepare(tensors);
408 
409  auto src = tensors.get_const_tensor(ACL_SRC_0);
410 
411  CLAuxTensorHandler flattened_src(offset_int_vec(FlattenedSrc), _flattened_src, tensors, false);
412  CLAuxTensorHandler weights(_weights_to_use_idx, _weights_to_use, tensors, false);
413 
414  // Linearize input if it comes from a convolutional layer
415  if(_is_fc_after_conv)
416  {
417  ITensorPack flatten_pack{ { ACL_SRC, src }, { ACL_DST, flattened_src.get() } };
418  _flatten->run(flatten_pack);
419  }
420 
421  ITensorPack gemm_pack = tensors;
422  gemm_pack.add_const_tensor(ACL_SRC_0, (_is_fc_after_conv) ? flattened_src.get() : src);
423  if(_weights_to_use_idx != ACL_SRC_1)
424  {
425  gemm_pack.add_const_tensor(ACL_SRC_1, weights.get());
426  }
427 
428  // Run matrix multiply
429  if(_is_quantized)
430  {
431  _mm_gemmlowp->run(gemm_pack);
432  }
433  else
434  {
435  _mm_gemm->run(gemm_pack);
436  }
437 }
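
Continuing the same sketch (the pack and tensors are the invented ones from the prepare() example above), a single inference then amounts to calling run() and synchronising the OpenCL queue before reading the result back on the host:

  fc.run(pack);              // enqueues the kernels; calls prepare() itself on the first use
  CLScheduler::get().sync(); // wait for the enqueued OpenCL work to finish

  dst.map(true);             // map the output buffer for host access
  // ... read the results through dst.buffer() or a Window/Iterator pair ...
  dst.unmap();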

◆ validate()

Status validate ( const ITensorInfo * src,
const ITensorInfo * weights,
const ITensorInfo * biases,
const ITensorInfo * dst,
FullyConnectedLayerInfo  fc_info = FullyConnectedLayerInfo() 
)
static

Static function to check if given info will lead to a valid configuration.

Similar to ClFullyConnected::configure()

Returns
a status
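
A caller would typically consume the returned status as in the sketch below (the tensor infos are the invented ones from the earlier configure() sketch; error_code() and error_description() are assumed from the core Status API, and <iostream> is needed for the print):

  const Status st = opencl::ClFullyConnected::validate(&src_info, &wei_info, &bia_info, &dst_info, fc_info);
  if(st.error_code() != ErrorCode::OK)
  {
      std::cerr << "ClFullyConnected rejected the configuration: " << st.error_description() << std::endl;
  }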

Definition at line 327 of file ClFullyConnected.cpp.

References ActivationLayerInfo::activation(), FullyConnectedLayerInfo::activation_info, FullyConnectedLayerInfo::are_weights_reshaped, ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, ActivationLayerInfo::BOUNDED_RELU, Dimensions< T >::cbegin(), Dimensions< T >::cend(), ICloneable< T >::clone(), arm_compute::misc::shape_calculator::compute_flatten_shape(), arm_compute::misc::shape_calculator::compute_transposed_shape(), FullyConnectedLayerInfo::constant_weights, ITensorInfo::data_layout(), ITensorInfo::data_type(), ITensorInfo::dimension(), ActivationLayerInfo::enabled(), arm_compute::F16, arm_compute::F32, arm_compute::is_data_type_quantized(), ActivationLayerInfo::LU_BOUNDED_RELU, arm_compute::NCHW, ITensorInfo::num_dimensions(), Dimensions< size_t >::num_max_dimensions, arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, ActivationLayerInfo::RELU, arm_compute::test::validation::src, ITensorInfo::tensor_shape(), FullyConnectedLayerInfo::transpose_weights, ClTranspose::validate(), ClConvertFullyConnectedWeights::validate(), ClFlatten::validate(), and FullyConnectedLayerInfo::weights_trained_layout.

Referenced by ClFullyConnected::configure(), and CLFullyConnectedLayer::validate().

329 {
330  ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(src, weights, dst);
331  ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(src, 1, DataType::QASYMM8, DataType::QASYMM8_SIGNED, DataType::F16, DataType::F32);
332  ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(src, weights, dst);
333  ARM_COMPUTE_RETURN_ERROR_ON(weights->num_dimensions() > 2);
334  ARM_COMPUTE_RETURN_ERROR_ON(fc_info.activation_info.enabled() && is_data_type_quantized(src->data_type()) && fc_info.activation_info.activation() != ActivationLayerInfo::ActivationFunction::RELU
335  && fc_info.activation_info.activation() != ActivationLayerInfo::ActivationFunction::BOUNDED_RELU && fc_info.activation_info.activation() != ActivationLayerInfo::ActivationFunction::LU_BOUNDED_RELU);
336  ARM_COMPUTE_RETURN_ERROR_ON(!fc_info.constant_weights && (!fc_info.are_weights_reshaped || fc_info.transpose_weights));
337 
338  bool weights_reshaped = fc_info.transpose_weights ? fc_info.are_weights_reshaped : true;
339  bool is_fc_after_conv = true;
340 
341  const ITensorInfo &flatten_src = TensorInfo(src->clone()->set_is_resizable(true).reset_padding().set_tensor_shape(compute_flatten_shape(src)).set_data_layout(DataLayout::NCHW));
342  const ITensorInfo &reshaped_weights = TensorInfo(weights->clone()->set_is_resizable(true).reset_padding().set_tensor_shape(compute_transposed_shape(*weights)));
343  const ITensorInfo &converted_weights = weights_reshaped ? TensorInfo(weights->clone()->set_is_resizable(true).reset_padding()) : TensorInfo(*reshaped_weights.clone());
344 
345  // With the Fully Connected layer we can have 4 different cases:
346  // 1) Convolution layer -> Fully Connected layer without batches
347  // 2) Fully Connected layer -> Fully Connected layer without batches
348  // 3) Convolution layer -> Fully Connected layer with batches
349  // 4) Fully Connected layer -> Fully Connected layer with batches
350 
351  const ITensorInfo *src_to_use = src;
352  const ITensorInfo *weights_to_use = weights;
353 
354  // Check if we have a fully connected layer with batches
355  const bool is_batched_fc_layer = dst->dimension(1) > 1;
356  if(is_batched_fc_layer)
357  {
358  is_fc_after_conv = (TensorShape::num_max_dimensions >= 4) && (std::equal(src->tensor_shape().cbegin() + 3,
359  src->tensor_shape().cend(),
360  dst->tensor_shape().cbegin() + 1));
361  }
362  else
363  {
364  is_fc_after_conv = src->num_dimensions() > 1;
365  }
366 
367  if(!weights_reshaped)
368  {
369  // Validate reshape weights kernel
370  ARM_COMPUTE_RETURN_ON_ERROR(ClTranspose::validate(weights, &reshaped_weights));
371  weights_to_use = &reshaped_weights;
372  }
373 
374  if(is_fc_after_conv && (src->data_layout() != fc_info.weights_trained_layout))
375  {
376  // Validate convert weights kernel
377  ARM_COMPUTE_RETURN_ON_ERROR(ClConvertFullyConnectedWeights::validate(weights_to_use,
378  &converted_weights,
379  src->tensor_shape(),
380  fc_info.weights_trained_layout));
381  weights_to_use = &converted_weights;
382  }
383 
384  if(is_fc_after_conv)
385  {
386  // Fully Connected layer after a Convolution Layer without batches
387  ARM_COMPUTE_RETURN_ERROR_ON((weights_to_use->dimension(1) != (src->dimension(0) * src->dimension(1) * src->dimension(2))));
388 
389  // Validate flatten kernel
390  ARM_COMPUTE_RETURN_ON_ERROR(ClFlatten::validate(src, &flatten_src));
391  src_to_use = &flatten_src;
392  }
393  else
394  {
395  // Fully Connected layer after a Fully Connected Layer without batches
396  ARM_COMPUTE_RETURN_ERROR_ON(src->dimension(0) != weights_to_use->dimension(1));
397  }
398 
399  // Validate matrix multiply kernel
400  ARM_COMPUTE_RETURN_ON_ERROR(validate_mm(*src_to_use, *weights_to_use, biases, *dst, fc_info));
401 
402  return Status{};
403 }

◆ workspace()

experimental::MemoryRequirements workspace ( ) const
override virtual

Return the memory requirements required by the workspace.

Reimplemented from ICLOperator.

Definition at line 491 of file ClFullyConnected.cpp.

492 {
493  return _aux_mem;
494 }
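
The requirements are meant to be satisfied by the caller before prepare()/run(). The sketch below is an assumption-laden illustration rather than documented usage: it presumes MemoryInfo exposes the slot, size and lifetime fields implied by the MemoryInfo(...) constructor calls in configure() above, and it reuses fc and pack from the earlier examples. It allocates one backing CLTensor per requested slot and imports it into the same ITensorPack that is passed to prepare() and run():

  // Sketch: needs <memory> and <vector>.
  std::vector<std::unique_ptr<CLTensor>> aux_tensors;
  for(const auto &req : fc.workspace())
  {
      if(req.size == 0)
      {
          continue; // nothing requested for this slot
      }
      auto aux = std::make_unique<CLTensor>();
      aux->allocator()->init(TensorInfo(TensorShape(req.size), 1, DataType::U8)); // plain byte buffer
      aux->allocator()->allocate();
      pack.add_tensor(req.slot, aux.get()); // slot ids match the ones used inside the operator
      aux_tensors.emplace_back(std::move(aux));
  }
  // Buffers with MemoryLifetime::Prepare are only needed until prepare() returns; Temporary
  // ones must stay alive for every run(); Persistent ones for the operator's whole lifetime.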

The documentation for this class was generated from the following files:

  ClFullyConnected.h
  ClFullyConnected.cpp