Basic function to compute the convolution layer.

#include <ClConv2d.h>

Public Member Functions
	ClConv2d ()
	Default constructor.

	~ClConv2d ()
	Default destructor.

	ClConv2d (const ClConv2d &)=delete
	Prevent instances of this class from being copied (as this class contains pointers).

	ClConv2d (ClConv2d &&)=default
	Default move constructor.

ClConv2d &	operator= (const ClConv2d &)=delete
	Prevent instances of this class from being copied (as this class contains pointers).

ClConv2d &	operator= (ClConv2d &&)=default
	Default move assignment operator.

void	configure (const CLCompileContext &compile_context, ITensorInfo *src, ITensorInfo *weights, ITensorInfo *biases, ITensorInfo *dst, const Conv2dInfo &conv2d_info, const WeightsInfo &weights_info=WeightsInfo())
	Set the src and dst tensors.

void	run (ITensorPack &tensors) override
	Run the kernels contained in the function.

void	prepare (ITensorPack &tensors) override
	Prepare the function for executing.

experimental::MemoryRequirements	workspace () const override
	Return the memory requirements required by the workspace.

Public Member Functions inherited from ICLOperator
	ICLOperator (IRuntimeContext *ctx=nullptr)
	Constructor.

	ICLOperator (const ICLOperator &)=delete
	Prevent instances of this class from being copied (as this class contains pointers).

	ICLOperator (ICLOperator &&)=default
	Default move constructor.

ICLOperator &	operator= (const ICLOperator &)=delete
	Prevent instances of this class from being copied (as this class contains pointers).

ICLOperator &	operator= (ICLOperator &&)=default
	Default move assignment operator.

Public Member Functions inherited from IOperator
virtual	~IOperator ()=default
	Destructor.

Static Public Member Functions
static Status	validate (const ITensorInfo *src, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *dst, const Conv2dInfo &conv2d_info, const WeightsInfo &weights_info=WeightsInfo())
	Static function to check if given info will lead to a valid configuration of ClConv2d.

static ConvolutionMethod	get_convolution_method (const ITensorInfo *src, const ITensorInfo *weights, const ITensorInfo *dst, const Conv2dInfo &conv2d_info, const WeightsInfo &weights_info, const GPUTarget gpu_target)
	Static function that returns the convolution method ClConv2d would select for the given info.

Detailed Description

Basic function to compute the convolution layer.

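This function calls the following OpenCL kernels/functions:

ClGemmConv2d
ClWinogradConv2d
ClDirectConv2d
ClIndirectConv2d
CLFFTConvolutionLayer (validated in get_convolution_method(); the configure() listing has no FFT case)

The function selects one of the algorithms mentioned above based on: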

The size of the kernel
Number of src/dst feature maps
Amount of memory needed

Generally GEMM-based convolution is executed when neither Winograd nor FFT nor Direct convolution can be performed.

FP32 Algorithm	Filter Size	Input/Output feature maps
Winograd	3x3 1x3 3x1 5x1 1x5 5x5 (fast maths) 7x1 1x7	Input channels greater than 3
FFT	Square kernels larger than 9x9	Input feature maps > Output feature maps
DirectConv	9x9
GEMM	Any size

Winograd 5x5 requires fast maths enabled.

FP16 Algorithm	Filter Size	Input/Output feature maps
Winograd	3x3 1x3 3x1 5x1 1x5 5x5	Input channels greater than 3
FFT	Not supported
DirectConv	9x9
GEMM	Any size

Winograd FP16 requires fast maths enabled.
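
The Winograd rows of the two tables can be read as a small predicate. The following is a minimal sketch of the documentation tables only, not of the heuristic in get_convolution_method(), whose thresholds differ. The names `Filter` and `winograd_listed` are hypothetical, and the filter sizes are taken exactly as written in the tables.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <utility>

// Filter size as written in the tables, e.g. {3, 3} for "3x3". Hypothetical helper type.
using Filter = std::pair<int, int>;

// Returns true when the tables list Winograd for this filter, data type and input channel count.
bool winograd_listed(Filter f, bool is_fp16, bool fast_math, std::size_t input_channels)
{
    // Sizes listed for both FP32 and FP16.
    static const std::array<Filter, 6> common = {{{3, 3}, {1, 3}, {3, 1}, {5, 1}, {1, 5}, {5, 5}}};
    // Sizes listed for FP32 only.
    static const std::array<Filter, 2> fp32_only = {{{7, 1}, {1, 7}}};

    const bool in_common    = std::find(common.begin(), common.end(), f) != common.end();
    const bool in_fp32_only = std::find(fp32_only.begin(), fp32_only.end(), f) != fp32_only.end();

    // Both tables state "Input channels greater than 3" for Winograd.
    if (input_channels <= 3)
    {
        return false;
    }
    if (is_fp16)
    {
        // "Winograd FP16 requires fast maths enabled."
        return in_common && fast_math;
    }
    if (f == Filter{5, 5})
    {
        // "Winograd 5x5 requires fast maths enabled."
        return fast_math;
    }
    return in_common || in_fp32_only;
}
```

For example, `winograd_listed({5, 5}, false, false, 16)` is false because the FP32 5x5 variant needs fast maths, while the same call with `fast_math = true` is true.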

Definition at line 72 of file ClConv2d.h.

Constructor & Destructor Documentation

◆ ClConv2d() [1/3]

ClConv2d ( )

Default constructor.

Definition at line 74 of file ClConv2d.cpp.

                    : _operator()
 {
 }

◆ ~ClConv2d()

~ClConv2d ( )

default

Default Destructor.

◆ ClConv2d() [2/3]

ClConv2d ( const ClConv2d & )

delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ ClConv2d() [3/3]

ClConv2d ( ClConv2d && )

default

Default move constructor.

Member Function Documentation

◆ configure()

void configure	(	const CLCompileContext &	compile_context,
		ITensorInfo *	src,
		ITensorInfo *	weights,
		ITensorInfo *	biases,
		ITensorInfo *	dst,
		const Conv2dInfo &	conv2d_info,
		const WeightsInfo &	weights_info = `WeightsInfo()`
	)

Set the src and dst tensors.

Valid data layouts:

NHWC
NCHW

Valid data type configurations:

src0	src1	src2	dst
F16	F16	F16	F16
F32	F32	F32	F32
QASYMM8	QASYMM8	S32	QASYMM8
QASYMM8	QSYMM8_PER_CHANNEL	S32	QASYMM8
QASYMM8_SIGNED	QASYMM8_SIGNED	S32	QASYMM8_SIGNED
QASYMM8_SIGNED	QSYMM8_PER_CHANNEL	S32	QASYMM8_SIGNED
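
As an illustration, the table above can be encoded as a lookup. This is a sketch under stated assumptions: `DType`, `TypeConfig` and `is_documented_type_config` are hypothetical stand-ins rather than library types, and the check covers only the rows of the table (it does not model the case where `biases` is omitted).

```cpp
#include <array>

// Hypothetical stand-in for the library's data type enum, limited to the types in the table.
enum class DType { F16, F32, QASYMM8, QASYMM8_SIGNED, QSYMM8_PER_CHANNEL, S32 };

struct TypeConfig
{
    DType src;     // src0
    DType weights; // src1
    DType biases;  // src2
    DType dst;
};

// One entry per row of the table.
static const std::array<TypeConfig, 6> valid_type_configs = {{
    {DType::F16, DType::F16, DType::F16, DType::F16},
    {DType::F32, DType::F32, DType::F32, DType::F32},
    {DType::QASYMM8, DType::QASYMM8, DType::S32, DType::QASYMM8},
    {DType::QASYMM8, DType::QSYMM8_PER_CHANNEL, DType::S32, DType::QASYMM8},
    {DType::QASYMM8_SIGNED, DType::QASYMM8_SIGNED, DType::S32, DType::QASYMM8_SIGNED},
    {DType::QASYMM8_SIGNED, DType::QSYMM8_PER_CHANNEL, DType::S32, DType::QASYMM8_SIGNED},
}};

// Returns true when the combination matches a row of the table.
bool is_documented_type_config(DType src, DType weights, DType biases, DType dst)
{
    for (const TypeConfig &c : valid_type_configs)
    {
        if (c.src == src && c.weights == weights && c.biases == biases && c.dst == dst)
        {
            return true;
        }
    }
    return false;
}
```

Note that quantized sources always pair with S32 biases, which matches the `biases` parameter description.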

Parameters

[in]	compile_context	The compile context to be used.
[in]	src	Source tensor info. 3 lower dimensions represent a single src [width, height, IFM], while every optional dimension from 4 and above represent a batch of srcs. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32.
[in]	weights	Weights tensor info. Weights are 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: Same as `src`, also could be QSYMM8_PER_CHANNEL if src is QASYMM8/QASYMM8_SIGNED.
[in]	biases	Biases tensor info. Shared biases supported. Biases are 1D tensor with dimensions [OFM]. Data type supported: Same as `src`, except for src of QASYMM8/QASYMM8_SIGNED type where biases should be of S32 type.
[out]	dst	Destination tensor info. 3 lower dimensions represent a single dst [width, height, OFM], while the rest represent batch of dsts. Data types supported: Same as `src`.
[in]	conv2d_info	Contains convolution 2d info described in Conv2dInfo.
[in]	weights_info	Specifies if the weights tensor has been reshaped with CLWeightsReshapeKernel. Data type supported: Same as `src`.

Definition at line 80 of file ClConv2d.cpp.

 {
     ARM_COMPUTE_ERROR_ON_NULLPTR(src, weights, dst);
     ARM_COMPUTE_ERROR_THROW_ON(
         ClConv2d::validate(src, weights, ((biases != nullptr) ? biases : nullptr), dst, conv2d_info, weights_info));
     ARM_COMPUTE_LOG_PARAMS(src, weights, biases, dst, conv2d_info, weights_info);
  
     switch (ClConv2d::get_convolution_method(src, weights, dst, conv2d_info, weights_info, CLScheduler::get().target()))
     {
         case ConvolutionMethod::WINOGRAD:
         {
             ARM_COMPUTE_ERROR_ON(conv2d_info.num_groups != 1);
             auto f = std::make_unique<ClWinogradConv2d>();
             f->configure(compile_context, src, weights, biases, dst, conv2d_info.conv_info, conv2d_info.act_info,
                          conv2d_info.enable_fast_math);
             _operator = std::move(f);
             break;
         }
         case ConvolutionMethod::DIRECT:
         {
             ARM_COMPUTE_ERROR_ON(conv2d_info.num_groups != 1);
             auto f = std::make_unique<ClDirectConv2d>();
             f->configure(compile_context, src, weights, biases, dst, conv2d_info.conv_info, conv2d_info.act_info);
             _operator = std::move(f);
             break;
         }
         case ConvolutionMethod::INDIRECT:
         {
             ARM_COMPUTE_ERROR_ON(conv2d_info.num_groups != 1);
             auto f = std::make_unique<ClIndirectConv2d>();
             f->configure(compile_context, src, weights, biases, dst, conv2d_info.conv_info, conv2d_info.act_info);
             _operator = std::move(f);
             break;
         }
         case ConvolutionMethod::GEMM:
         {
             auto f = std::make_unique<ClGemmConv2d>();
             f->configure(compile_context, src, weights, biases, dst, conv2d_info, weights_info);
             _operator = std::move(f);
             break;
         }
         default:
             ARM_COMPUTE_ERROR("Not supported.");
             break;
     }
     _aux_mem = _operator->workspace();
 }

References Conv2dInfo::act_info, ARM_COMPUTE_ERROR, ARM_COMPUTE_ERROR_ON, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, ARM_COMPUTE_LOG_PARAMS, Conv2dInfo::conv_info, arm_compute::DIRECT, arm_compute::test::validation::dst, Conv2dInfo::enable_fast_math, arm_compute::GEMM, CLScheduler::get(), ClConv2d::get_convolution_method(), arm_compute::INDIRECT, Conv2dInfo::num_groups, arm_compute::test::validation::src, ClConv2d::validate(), arm_compute::test::validation::weights_info, and arm_compute::WINOGRAD.
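
The dispatch in configure() follows a common pattern: pick a concrete operator from the selected method and store it behind a base-class pointer. The sketch below reproduces that pattern with hypothetical stand-in types (`IOp`, `WinogradOp`, `make_operator`, and so on); it does not use the library's classes, and it returns `nullptr` where the listing raises "Not supported."

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for ConvolutionMethod.
enum class Method { Winograd, Direct, Indirect, Gemm, Fft };

// Hypothetical stand-ins for the operators behind _operator.
struct IOp
{
    virtual ~IOp() = default;
    virtual std::string name() const = 0;
};
struct WinogradOp : IOp { std::string name() const override { return "winograd"; } };
struct DirectOp   : IOp { std::string name() const override { return "direct"; } };
struct IndirectOp : IOp { std::string name() const override { return "indirect"; } };
struct GemmOp     : IOp { std::string name() const override { return "gemm"; } };

// Mirrors the switch in the listing: only the GEMM path accepts num_groups != 1.
std::unique_ptr<IOp> make_operator(Method method, unsigned int num_groups)
{
    switch (method)
    {
        case Method::Winograd:
            assert(num_groups == 1);
            return std::make_unique<WinogradOp>();
        case Method::Direct:
            assert(num_groups == 1);
            return std::make_unique<DirectOp>();
        case Method::Indirect:
            assert(num_groups == 1);
            return std::make_unique<IndirectOp>();
        case Method::Gemm:
            return std::make_unique<GemmOp>();
        default:
            break; // the listing reports "Not supported." here
    }
    return nullptr;
}
```

A method without a case in the switch, such as FFT, falls through to the default branch in this sketch just as it does in the listing.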

◆ get_convolution_method()

ConvolutionMethod get_convolution_method	(	const ITensorInfo *	src,
		const ITensorInfo *	weights,
		const ITensorInfo *	dst,
		const Conv2dInfo &	conv2d_info,
		const WeightsInfo &	weights_info,
		const GPUTarget	gpu_target
	)

static

Static function that returns the convolution method ClConv2d would select for the given info.

Parameters

[in]	src	Source tensor. 3 lower dimensions represent a single src [width, height, IFM], while every optional dimension from 4 and above represent a batch of srcs. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32.
[in]	weights	Weights tensor. Weights are 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: Same as `src`, also could be QSYMM8_PER_CHANNEL if src is QASYMM8/QASYMM8_SIGNED.
[in]	dst	Destination tensor. 3 lower dimensions represent a single dst [width, height, OFM], while the rest represent batch of dsts. Data types supported: Same as `src`.
[in]	conv2d_info	Contains convolution 2d info described in Conv2dInfo.
[in]	weights_info	Specifies if the weights tensor has been reshaped with CLWeightsReshapeKernel.
[in]	gpu_target	Specifies the `GPUTarget`.

Returns: the Convolution Method Hint

Definition at line 190 of file ClConv2d.cpp.

 {
     ARM_COMPUTE_ERROR_ON_NULLPTR(src);
     ARM_COMPUTE_ERROR_ON_NULLPTR(dst);
     ARM_COMPUTE_ERROR_ON_NULLPTR(weights);
     ARM_COMPUTE_UNUSED(weights_info);
  
     const PadStrideInfo       conv_info        = conv2d_info.conv_info;
     const ActivationLayerInfo act_info         = conv2d_info.act_info;
     const Size2D              dilation         = conv2d_info.dilation;
     bool                      enable_fast_math = conv2d_info.enable_fast_math;
  
     const size_t idx_w = get_data_layout_dimension_index(src->data_layout(), DataLayoutDimension::WIDTH);
     const size_t idx_h = get_data_layout_dimension_index(src->data_layout(), DataLayoutDimension::HEIGHT);
     const size_t idx_c = get_data_layout_dimension_index(src->data_layout(), DataLayoutDimension::CHANNEL);
  
     /* Input spatial dims, kernel size, IFM/OFM, conv info*/
     using ConvolutionConfiguration = std::tuple<Size2D, Size2D, Size2D, PadStrideInfo, DataLayout>;
     using ConfigurationMethod      = std::pair<ConvolutionConfiguration, ConvolutionMethod>;
  
     const std::vector<ConfigurationMethod> known_configs = {
         // Alexnet
         ConfigurationMethod(ConvolutionConfiguration(Size2D(27U, 27U), Size2D(5U, 5U), Size2D(48U, 128U),
                                                      PadStrideInfo(1U, 1U, 2U, 2U), DataLayout::NCHW),
                             ConvolutionMethod::DIRECT),
         // VGG16 / VGG19
         ConfigurationMethod(ConvolutionConfiguration(Size2D(224U, 224U), Size2D(3U, 3U), Size2D(3U, 64U),
                                                      PadStrideInfo(1U, 1U, 1U, 1U), DataLayout::NCHW),
                             ConvolutionMethod::DIRECT),
         // Mobilenet 224
         ConfigurationMethod(ConvolutionConfiguration(
                                 Size2D(224U, 224U), Size2D(3U, 3U), Size2D(3U, 32U),
                                 PadStrideInfo(2U, 2U, 0U, 1U, 0U, 1U, DimensionRoundingType::FLOOR), DataLayout::NCHW),
                             ConvolutionMethod::GEMM),
         // Mobilenet 160
         ConfigurationMethod(ConvolutionConfiguration(
                                 Size2D(160U, 160U), Size2D(3U, 3U), Size2D(3U, 24U),
                                 PadStrideInfo(2U, 2U, 0U, 1U, 0U, 1U, DimensionRoundingType::FLOOR), DataLayout::NCHW),
                             ConvolutionMethod::GEMM),
         // Mobilenet 224
         ConfigurationMethod(ConvolutionConfiguration(
                                 Size2D(224U, 224U), Size2D(3U, 3U), Size2D(3U, 32U),
                                 PadStrideInfo(2U, 2U, 0U, 1U, 0U, 1U, DimensionRoundingType::FLOOR), DataLayout::NHWC),
                             ConvolutionMethod::GEMM),
         // Mobilenet 160
         ConfigurationMethod(ConvolutionConfiguration(
                                 Size2D(160U, 160U), Size2D(3U, 3U), Size2D(3U, 24U),
                                 PadStrideInfo(2U, 2U, 0U, 1U, 0U, 1U, DimensionRoundingType::FLOOR), DataLayout::NHWC),
                             ConvolutionMethod::GEMM),
     };
  
     const auto find_config = [&](ConfigurationMethod c)
     {
         const ConvolutionConfiguration config      = c.first;
         const PadStrideInfo            info        = std::get<3>(config);
         const DataLayout               data_layout = std::get<4>(config);
  
         return std::get<0>(config) == Size2D(src->dimension(idx_w), src->dimension(idx_h)) &&
                std::get<1>(config) == Size2D(weights->dimension(idx_w), weights->dimension(idx_h)) &&
                std::get<2>(config) == Size2D(weights->dimension(idx_c), weights->dimension(3)) &&
                info.pad_top() == conv_info.pad_top() && info.pad_right() == conv_info.pad_right() &&
                info.pad_bottom() == conv_info.pad_bottom() && info.pad_left() == conv_info.pad_left() &&
                info.stride() == conv_info.stride() && (data_layout == src->data_layout());
     };
  
     std::vector<ConfigurationMethod>::const_iterator found;
     if ((found = std::find_if(known_configs.begin(), known_configs.end(), find_config)) != known_configs.end())
     {
         return (*found).second;
     }
  
     if (dilation != Size2D(1U, 1U))
     {
         return ConvolutionMethod::GEMM;
     }
     else
     {
         if (src->data_layout() == DataLayout::NCHW)
         {
             // SRGAN
             if ((src->dimension(idx_h) > 720U) && (dst->dimension(idx_h) > 720U) && (weights->dimension(idx_h) == 9) &&
                 (conv_info.pad_top() < 3) &&
                 (ClDirectConv2d::validate(src, weights, nullptr, dst, conv_info, act_info)))
             {
                 return ConvolutionMethod::DIRECT;
             }
             if ((weights->dimension(idx_h) > 5) && (src->dimension(idx_c) > dst->dimension(idx_c)) &&
                 (CLFFTConvolutionLayer::validate(src, weights, nullptr, dst, conv_info, act_info, enable_fast_math)))
             {
                 return ConvolutionMethod::FFT;
             }
             if (src->dimension(idx_c) < 16)
             {
                 return ConvolutionMethod::GEMM;
             }
             return bool(ClWinogradConv2d::validate(src, weights, nullptr, dst, conv_info, act_info, enable_fast_math))
                        ? ConvolutionMethod::WINOGRAD
                        : ConvolutionMethod::GEMM;
         }
         else
         {
             const bool is_direct_valid =
                 bool(ClDirectConv2d::validate(src, weights, nullptr, dst, conv_info, act_info));
             const bool is_wino_valid =
                 bool(ClWinogradConv2d::validate(src, weights, nullptr, dst, conv_info, act_info, enable_fast_math));
             const size_t kernel_sz_direct_conv_thr = get_direct_conv_kernel_threshold_nhwc(gpu_target);
  
             // SRGAN case
             if ((src->dimension(idx_h) > 720U) && (dst->dimension(idx_h) > 720U) && (weights->dimension(idx_h) == 9) &&
                 (conv_info.pad_top() < 3) && is_direct_valid)
             {
                 return ConvolutionMethod::DIRECT;
             }
  
             // Floating-point case: GeMM/Direct/Winograd
             if (is_data_type_float(src->data_type()))
             {
                 // Get dst shape
                 TensorShape output_shape =
                     misc::shape_calculator::compute_deep_convolution_shape(*src, *weights, conv_info);
                 const bool is_large_kernel_sz = (weights->dimension(idx_w) >= kernel_sz_direct_conv_thr) &&
                                                 (weights->dimension(idx_h) >= kernel_sz_direct_conv_thr);
                 const bool is_ifm_ge_8       = src->dimension(idx_c) >= 8;
                 const bool is_ifm_ge_16      = src->dimension(idx_c) >= 16;
                 const bool is_ofm_lte_8      = weights->dimension(3U) <= 8;
                 const bool is_ofm_lt_64      = weights->dimension(3U) < 64;
                 const bool workload_gte_8192 = (output_shape[0] * output_shape[1] * output_shape[2]) / 16 >= 8192;
                 const bool is_ifm_gt_ofm     = src->dimension(idx_c) > weights->dimension(3U);
                 const bool is_m_one          = output_shape[1] * output_shape[2] == 1;
                 const bool is_unit_stride =
                     (conv2d_info.conv_info.stride().first == 1) && (conv2d_info.conv_info.stride().second == 1);
                 const int32_t kernel_sz = weights->dimension(idx_w) * weights->dimension(idx_h);
  
                 // Run Winograd if valid and IFM >= 8
                 if (is_wino_valid && is_ifm_ge_8)
                 {
                     if (is_ofm_lte_8)
                     {
                         if (gpu_target == arm_compute::GPUTarget::G71 || gpu_target == arm_compute::GPUTarget::G72 ||
                             get_arch_from_target(gpu_target) == arm_compute::GPUTarget::MIDGARD)
                         {
                             return ConvolutionMethod::WINOGRAD;
                         }
                     }
                     else
                     {
                         return ConvolutionMethod::WINOGRAD;
                     }
                 }
  
                 // Direct convolution case
                 if (is_direct_valid)
                 {
                     if ((gpu_target == arm_compute::GPUTarget::G71 || gpu_target == arm_compute::GPUTarget::G72 ||
                          get_arch_from_target(gpu_target) == arm_compute::GPUTarget::MIDGARD))
                     {
                         if (is_large_kernel_sz && is_ifm_ge_16 && is_ifm_gt_ofm)
                         {
                             return ConvolutionMethod::DIRECT;
                         }
                     }
                     else if (gpu_target == arm_compute::GPUTarget::G76)
                     {
                         if ((is_large_kernel_sz && workload_gte_8192 && is_ifm_ge_16) || (is_ofm_lte_8 && is_ifm_ge_16))
                         {
                             return ConvolutionMethod::DIRECT;
                         }
                     }
                     else
                     {
                         ConvolutionMethod preferred_conv_method = ConvolutionMethod::DIRECT;
  
                         const bool is_indirect_valid =
                             bool(ClIndirectConv2d::validate(src, weights, nullptr, dst, conv_info, act_info));
  
                         // indirect conv2d should be called when:
                         // 1- When the kernel size is greater than 1x1 and less than or equal to 9x9 (81)
                         // 2- When the kernel size is odd
                         // 3- When the Gpu target is Arm Mali-G77
                         if (is_indirect_valid)
                         {
                             const bool is_kernel_sz_odd = kernel_sz % 2;
                             const bool is_g77           = gpu_target == GPUTarget::G77;
                             preferred_conv_method = (kernel_sz > 1) && (kernel_sz <= 81) && is_kernel_sz_odd && is_g77
                                                         ? ConvolutionMethod::INDIRECT
                                                         : ConvolutionMethod::DIRECT;
                         }
  
                         // Direct/indirect convolution used for the first layer of the network
                         if (workload_gte_8192 && !is_ifm_ge_16 && !is_unit_stride && is_ofm_lt_64)
                         {
                             // In general, the question we should ask for the first convolution layer of a model is:
                             // when the execution time of im2col + gemm < direct?. Since im2col does not depend on the OFM, it means that
                             // when OFM is big enough, the contribution of im2col is small and the GEMM approach is preferable.
                             // From internal experiments, the OFM threshold is 64 (is_ofm_lt_64)
                             return preferred_conv_method;
                         }
  
                         if ((is_large_kernel_sz || is_m_one) && workload_gte_8192 && is_ifm_ge_16)
                         {
                             return preferred_conv_method;
                         }
  
                         // Direct convolution used for the last layer of the network
                         if (is_ofm_lte_8)
                         {
                             return preferred_conv_method;
                         }
                     }
                 }
  
                 // Default case
                 return ConvolutionMethod::GEMM;
             }
  
             // Generic case for quantized. Only GeMM
             return ConvolutionMethod::GEMM;
         }
     }
 }

References Conv2dInfo::act_info, arm_compute::test::validation::act_info, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_UNUSED, arm_compute::CHANNEL, arm_compute::misc::shape_calculator::compute_deep_convolution_shape(), Conv2dInfo::conv_info, arm_compute::test::validation::conv_info, arm_compute::cpu::data_layout, Conv2dInfo::dilation, ITensorInfo::dimension(), arm_compute::DIRECT, arm_compute::test::validation::dst, Conv2dInfo::enable_fast_math, arm_compute::FFT, arm_compute::FLOOR, arm_compute::G71, arm_compute::G72, arm_compute::G76, arm_compute::G77, arm_compute::GEMM, arm_compute::get_arch_from_target(), arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, arm_compute::INDIRECT, arm_compute::test::validation::info, arm_compute::is_data_type_float(), arm_compute::MIDGARD, arm_compute::NCHW, arm_compute::NHWC, arm_compute::test::validation::output_shape, arm_compute::test::validation::src, PadStrideInfo::stride(), arm_compute::utils::cast::U, ClDirectConv2d::validate(), ClIndirectConv2d::validate(), ClWinogradConv2d::validate(), CLFFTConvolutionLayer::validate(), arm_compute::test::validation::weights_info, arm_compute::WIDTH, and arm_compute::WINOGRAD.

Referenced by ClConv2d::configure(), CLConvolutionLayer::configure(), CLConvolutionLayer::get_convolution_method(), ClConv2d::validate(), and CLConvolutionLayer::validate().
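
The innermost branch of the listing (NHWC layout, floating-point data, direct convolution valid, Winograd not selected, and a target other than G71, G72, Midgard or G76) can be isolated as a pure function. This is a minimal sketch under those assumptions: `NhwcFloatCase` and `choose_nhwc_float_fallback` are hypothetical names, the validation results and the kernel-size threshold are passed in as plain values because their computation is not shown here, and the output shape is assumed to be ordered [OFM, width, height] as for NHWC.

```cpp
#include <cassert>
#include <cstddef>

enum class Method { Direct, Indirect, Gemm };

// Hypothetical bundle of the quantities the listing reads from src, weights, dst and conv2d_info.
struct NhwcFloatCase
{
    std::size_t kernel_w, kernel_h; // weights width and height
    std::size_t ifm, ofm;           // input and output feature maps
    std::size_t out_w, out_h;       // spatial size of the computed output shape
    std::size_t stride_x, stride_y;
    std::size_t large_kernel_thr;   // value returned by get_direct_conv_kernel_threshold_nhwc(gpu_target)
    bool        indirect_valid;     // result of ClIndirectConv2d::validate(...)
    bool        is_g77;             // gpu_target == GPUTarget::G77
};

Method choose_nhwc_float_fallback(const NhwcFloatCase &c)
{
    assert(c.out_w > 0 && c.out_h > 0 && c.ofm > 0);

    const bool is_large_kernel_sz = (c.kernel_w >= c.large_kernel_thr) && (c.kernel_h >= c.large_kernel_thr);
    const bool is_ifm_ge_16       = c.ifm >= 16;
    const bool is_ofm_lte_8       = c.ofm <= 8;
    const bool is_ofm_lt_64       = c.ofm < 64;
    // Product of all three output dimensions, scaled as in the listing.
    const bool workload_gte_8192  = (c.ofm * c.out_w * c.out_h) / 16 >= 8192;
    const bool is_m_one           = (c.out_w * c.out_h) == 1;
    const bool is_unit_stride     = (c.stride_x == 1) && (c.stride_y == 1);
    const std::size_t kernel_sz   = c.kernel_w * c.kernel_h;

    // Indirect is preferred only on G77, for odd kernel areas in (1, 81].
    Method preferred = Method::Direct;
    if (c.indirect_valid)
    {
        const bool is_kernel_sz_odd = (kernel_sz % 2) != 0;
        if ((kernel_sz > 1) && (kernel_sz <= 81) && is_kernel_sz_odd && c.is_g77)
        {
            preferred = Method::Indirect;
        }
    }

    // First layer of a network: large workload, few input channels, strided, OFM below 64.
    if (workload_gte_8192 && !is_ifm_ge_16 && !is_unit_stride && is_ofm_lt_64)
    {
        return preferred;
    }
    if ((is_large_kernel_sz || is_m_one) && workload_gte_8192 && is_ifm_ge_16)
    {
        return preferred;
    }
    // Last layer of a network: very few output feature maps.
    if (is_ofm_lte_8)
    {
        return preferred;
    }
    return Method::Gemm;
}
```

As a worked case, a 3x3 kernel with stride 2, IFM = 3, OFM = 32 and a 128x128 output gives a workload of 32 * 128 * 128 / 16 = 32768, which is at least 8192; with fewer than 16 input channels, a non-unit stride and OFM below 64, the first-layer rule applies and the preferred direct or indirect method is returned.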

◆ operator=() [1/2]

ClConv2d& operator= ( ClConv2d && )

default

Default move assignment operator.

◆ operator=() [2/2]

ClConv2d& operator= ( const ClConv2d & )

delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ prepare()

void prepare ( ITensorPack & tensors )

override virtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Parameters

[in] tensors Vector that contains the constant tensors.

Note: The prepare stage might not need all of the function's buffers' backing memory to be available in order to execute.

Reimplemented from ICLOperator.

Definition at line 422 of file ClConv2d.cpp.

 {
     _operator->prepare(tensors);
 }

Referenced by ClConv2d::run().

◆ run()

void run ( ITensorPack & tensors )

override virtual

Run the kernels contained in the function.

Parameters

[in] tensors Vector that contains the tensors to operate on.

Reimplemented from ICLOperator.

Definition at line 416 of file ClConv2d.cpp.

 {
     prepare(tensors);
     _operator->run(tensors);
 }

References ClConv2d::prepare().
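
Because run() calls prepare() on every invocation, the one-off work has to be guarded somewhere. The sketch below shows one way this forwarding could behave, assuming the wrapped operator tracks an `is_prepared` flag; that flag, and the types `TensorPack`, `InnerOp` and `Conv2dWrapper`, are hypothetical and are not taken from the listing.

```cpp
#include <memory>

// Hypothetical stand-in for ITensorPack.
struct TensorPack
{
};

// Hypothetical inner operator that guards its one-off step with a flag (an assumption).
struct InnerOp
{
    int  one_off_steps = 0;
    int  runs          = 0;
    bool is_prepared   = false;

    void prepare(TensorPack &)
    {
        if (!is_prepared)
        {
            ++one_off_steps; // e.g. a one-time weights transformation
            is_prepared = true;
        }
    }
    void run(TensorPack &)
    {
        ++runs;
    }
};

// Forwards prepare() and run() to the wrapped operator, as in the listings.
class Conv2dWrapper
{
public:
    Conv2dWrapper() : _operator(std::make_unique<InnerOp>())
    {
    }
    void prepare(TensorPack &tensors)
    {
        _operator->prepare(tensors);
    }
    void run(TensorPack &tensors)
    {
        prepare(tensors);
        _operator->run(tensors);
    }
    const InnerOp &inner() const
    {
        return *_operator;
    }

private:
    std::unique_ptr<InnerOp> _operator;
};
```

Calling `run()` twice on a `Conv2dWrapper` in this sketch executes the inner run step twice but the guarded one-off step only once.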

◆ validate()

Status validate	(	const ITensorInfo *	src,
		const ITensorInfo *	weights,
		const ITensorInfo *	biases,
		const ITensorInfo *	dst,
		const Conv2dInfo &	conv2d_info,
		const WeightsInfo &	weights_info = `WeightsInfo()`
	)

static

Static function to check if given info will lead to a valid configuration of ClConv2d.

Similar to ClConv2d::configure()

Returns: a status

Definition at line 134 of file ClConv2d.cpp.

 {
     ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(src, weights, dst);
     ARM_COMPUTE_RETURN_ERROR_ON_MSG((conv2d_info.num_groups != 1) && (src->data_layout() != DataLayout::NCHW),
                                     "Grouping (num_groups != 1) with NHWC data layout is not supported");
  
     const GPUTarget gpu_target = CLScheduler::get().target();
  
     switch (ClConv2d::get_convolution_method(src, weights, dst, conv2d_info, weights_info, gpu_target))
     {
         case ConvolutionMethod::WINOGRAD:
         {
             //Validate Winograd
             ARM_COMPUTE_RETURN_ERROR_ON_MSG(conv2d_info.num_groups != 1,
                                             "Grouping (num_groups != 1) with ClWinogradConv2d is not supported");
             ARM_COMPUTE_RETURN_ON_ERROR(ClWinogradConv2d::validate(src, weights, biases, dst, conv2d_info.conv_info,
                                                                    conv2d_info.act_info, conv2d_info.enable_fast_math));
             break;
         }
         case ConvolutionMethod::DIRECT:
         {
             // Validate direct convolution layer
             ARM_COMPUTE_RETURN_ERROR_ON_MSG(conv2d_info.num_groups != 1,
                                             "Grouping (num_groups != 1) with ClDirectConv2d is not supported");
             ARM_COMPUTE_RETURN_ON_ERROR(
                 ClDirectConv2d::validate(src, weights, biases, dst, conv2d_info.conv_info, conv2d_info.act_info));
             break;
         }
         case ConvolutionMethod::INDIRECT:
         {
             // Validate indirect convolution layer
             ARM_COMPUTE_RETURN_ERROR_ON_MSG(conv2d_info.num_groups != 1,
                                             "Grouping (num_groups != 1) with ClIndirectConv2d is not supported");
             ARM_COMPUTE_RETURN_ON_ERROR(
                 ClIndirectConv2d::validate(src, weights, biases, dst, conv2d_info.conv_info, conv2d_info.act_info));
             break;
         }
         case ConvolutionMethod::GEMM:
         {
             // Validate gemm-based convolution layer
             ARM_COMPUTE_RETURN_ON_ERROR(ClGemmConv2d::validate(src, weights, biases, dst, conv2d_info, weights_info));
             break;
         }
         default:
             ARM_COMPUTE_ERROR("Not supported.");
             break;
     }
  
     return Status{};
 }

References Conv2dInfo::act_info, ARM_COMPUTE_ERROR, ARM_COMPUTE_RETURN_ERROR_ON_MSG, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, Conv2dInfo::conv_info, arm_compute::DIRECT, arm_compute::test::validation::dst, Conv2dInfo::enable_fast_math, arm_compute::GEMM, CLScheduler::get(), ClConv2d::get_convolution_method(), arm_compute::INDIRECT, arm_compute::NCHW, Conv2dInfo::num_groups, arm_compute::test::validation::src, CLScheduler::target(), ClDirectConv2d::validate(), ClIndirectConv2d::validate(), ClWinogradConv2d::validate(), ClGemmConv2d::validate(), arm_compute::test::validation::weights_info, and arm_compute::WINOGRAD.

Referenced by ClConv2d::configure(), and CLConvolutionLayer::validate().
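
The grouping checks in validate() can be summarised as a small function. This is a sketch only: `check_grouping`, `Method` and `Layout` are hypothetical names, an empty string stands in for a successful `Status`, and the GEMM path is treated as accepting groups because the listing delegates any further checks to ClGemmConv2d::validate().

```cpp
#include <string>

enum class Method { Winograd, Direct, Indirect, Gemm };
enum class Layout { NCHW, NHWC };

// Returns an empty string when the grouping checks pass, otherwise the message from the listing.
std::string check_grouping(Method method, unsigned int num_groups, Layout layout)
{
    if (num_groups == 1)
    {
        return std::string();
    }
    // Checked first, before the convolution method is considered.
    if (layout != Layout::NCHW)
    {
        return "Grouping (num_groups != 1) with NHWC data layout is not supported";
    }
    switch (method)
    {
        case Method::Winograd:
            return "Grouping (num_groups != 1) with ClWinogradConv2d is not supported";
        case Method::Direct:
            return "Grouping (num_groups != 1) with ClDirectConv2d is not supported";
        case Method::Indirect:
            return "Grouping (num_groups != 1) with ClIndirectConv2d is not supported";
        case Method::Gemm:
            break; // further checks are delegated to ClGemmConv2d::validate()
    }
    return std::string();
}
```

In this reading, grouped convolution passes the checks shown here only for NCHW inputs that resolve to the GEMM method.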

◆ workspace()

experimental::MemoryRequirements workspace ( ) const

override virtual

Return the memory requirements required by the workspace.

Reimplemented from ICLOperator.

Definition at line 427 of file ClConv2d.cpp.

 {
     return _aux_mem;
 }

The documentation for this class was generated from the following files:

src/gpu/cl/operators/ClConv2d.h
src/gpu/cl/operators/ClConv2d.cpp
