Basic function to compute the convolution layer.

#include <NEGEMMConvolutionLayer.h>

Public Member Functions
	NEGEMMConvolutionLayer (const std::shared_ptr< IMemoryManager > &memory_manager=nullptr, IWeightsManager *weights_manager=nullptr)
	Constructor.

	NEGEMMConvolutionLayer (const NEGEMMConvolutionLayer &)=delete
	Prevent instances of this class from being copied (As this class contains pointers)

	NEGEMMConvolutionLayer (NEGEMMConvolutionLayer &&)=delete
	Prevent instances of this class from being moved (As this class contains non movable objects)

NEGEMMConvolutionLayer &	operator= (const NEGEMMConvolutionLayer &)=delete
	Prevent instances of this class from being copied (As this class contains pointers)

NEGEMMConvolutionLayer &	operator= (NEGEMMConvolutionLayer &&)=delete
	Prevent instances of this class from being moved (As this class contains non movable objects)

	~NEGEMMConvolutionLayer ()
	Default destructor.

void	configure (const ITensor *input, const ITensor *weights, const ITensor *biases, ITensor *output, const PadStrideInfo &conv_info, const WeightsInfo &weights_info=WeightsInfo(), const Size2D &dilation=Size2D(1U, 1U), const ActivationLayerInfo &act_info=ActivationLayerInfo(), unsigned int num_groups=1)
	Set the input and output tensors.

void	run () override
	Run the kernels contained in the function.

void	prepare () override
	Prepare the function for executing.

Public Member Functions inherited from IFunction
virtual	~IFunction ()=default
	Destructor.

Static Public Member Functions
static Status	validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, const PadStrideInfo &conv_info, const WeightsInfo &weights_info=WeightsInfo(), const Size2D &dilation=Size2D(1U, 1U), const ActivationLayerInfo &act_info=ActivationLayerInfo(), unsigned int num_groups=1)
	Static function to check if given info will lead to a valid configuration of NEGEMMConvolutionLayer.

Detailed Description

Basic function to compute the convolution layer.

This function calls the following Neon kernels/functions:

NEIm2ColKernel
NEGEMM (if the data type is BFLOAT16/FP16/FP32)
NEGEMMLowpMatrixMultiplyCore (if the data type is QASYMM8/QASYMM8_SIGNED)
NEGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPoint (if the data type is QASYMM8/QASYMM8_SIGNED)
NEArithmeticAddition (if biases != nullptr and we have a 1x1 convolution with the NHWC data layout)
NECol2ImKernel (if NCHW data layout)

Definition at line 163 of file NEGEMMConvolutionLayer.h.

Constructor & Destructor Documentation

◆ NEGEMMConvolutionLayer() [1/3]

NEGEMMConvolutionLayer	(	const std::shared_ptr< IMemoryManager > &	memory_manager = `nullptr`,
		IWeightsManager *	weights_manager = `nullptr`
	)

Constructor.

Definition at line 110 of file NEGEMMConvolutionLayer.cpp.

     : _memory_group(memory_manager), _weights_manager(weights_manager), _reshape_weights(), _reshape_weights_managed(), _im2col_kernel(), _mm_gemm(memory_manager), _mm_gemmlowp(memory_manager),
       _col2im_kernel(), _reshape_layer(), _original_weights(nullptr), _original_output(nullptr), _im2col_output(), _weights_reshaped(), _gemm_output(), _gemm_output_3d(), _tmp_output(),
       _data_layout(DataLayout::NCHW), _skip_im2col(false), _skip_col2im(false), _is_quantized(false), _is_prepared(false)
 {
 }

◆ NEGEMMConvolutionLayer() [2/3]

NEGEMMConvolutionLayer ( const NEGEMMConvolutionLayer & )

delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ NEGEMMConvolutionLayer() [3/3]

NEGEMMConvolutionLayer ( NEGEMMConvolutionLayer && )

delete

Prevent instances of this class from being moved (As this class contains non movable objects)

◆ ~NEGEMMConvolutionLayer()

~NEGEMMConvolutionLayer ( )

default

Default destructor.

Referenced by NEConvolutionLayerReshapeWeights::run().

Member Function Documentation

◆ configure()

void configure	(	const ITensor *	input,
		const ITensor *	weights,
		const ITensor *	biases,
		ITensor *	output,
		const PadStrideInfo &	conv_info,
		const WeightsInfo &	weights_info = `WeightsInfo()`,
		const Size2D &	dilation = `Size2D(1U, 1U)`,
		const ActivationLayerInfo &	act_info = `ActivationLayerInfo()`,
		unsigned int	num_groups = `1`
	)

Set the input and output tensors.

Parameters

[in]	input	Source tensor. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/BFLOAT16/F16/F32.
[in]	weights	Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data types supported: QASYMM8/QASYMM8_SIGNED/QSYMM8_PER_CHANNEL/BFLOAT16/F16/F32.
[in]	biases	Biases tensor. Shared biases supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: Should match `input` data type, except for QASYMM8/QASYMM8_SIGNED input, where biases should be of S32 type, and BFLOAT16 input, where validate() requires F32 biases.
[out]	output	Destination tensor. The 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: Same as `input`.
[in]	conv_info	Contains padding and stride information described in PadStrideInfo.
[in]	weights_info	Specifies if the weights tensor has been reshaped with NEWeightsReshapeKernel. If this is not part of the fully connected layer the weights tensor has also been transposed with NEGEMMTranspose1xWKernel. Data type supported: Same as `input`.
[in]	dilation	(Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
[in]	act_info	(Optional) Activation layer information in case of a fused activation. Only RELU, BOUNDED_RELU and LU_BOUNDED_RELU supported.
[in]	num_groups	(Optional) Number of groups when performing a grouped convolution. num_groups != 1 is not supported.
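
configure() rejects an output tensor whose width and height differ from the convolved dimensions derived from input, weights, conv_info and dilation. The following is a minimal standalone sketch of that arithmetic, assuming floor rounding; it is not the library's scaled_dimensions() implementation, and ConvParams, convolved_extent and convolved_dims are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for the stride and padding values carried by PadStrideInfo.
struct ConvParams
{
    unsigned int stride_x, stride_y;
    unsigned int pad_left, pad_right, pad_top, pad_bottom;
};

// Output extent along one axis, assuming floor rounding (an assumption of this sketch).
inline unsigned int convolved_extent(unsigned int in, unsigned int kernel,
                                     unsigned int pad_before, unsigned int pad_after,
                                     unsigned int stride, unsigned int dilation)
{
    assert(stride > 0 && kernel > 0 && dilation > 0);
    // A dilated kernel covers dilation * (kernel - 1) + 1 input elements.
    const unsigned int effective_kernel = dilation * (kernel - 1) + 1;
    assert(in + pad_before + pad_after >= effective_kernel);
    return (in + pad_before + pad_after - effective_kernel) / stride + 1;
}

// Returns {conv_w, conv_h} for the given input size, kernel size and parameters.
inline std::pair<unsigned int, unsigned int> convolved_dims(unsigned int width, unsigned int height,
                                                            unsigned int kernel_w, unsigned int kernel_h,
                                                            const ConvParams &p,
                                                            unsigned int dilation_x = 1,
                                                            unsigned int dilation_y = 1)
{
    return { convolved_extent(width, kernel_w, p.pad_left, p.pad_right, p.stride_x, dilation_x),
             convolved_extent(height, kernel_h, p.pad_top, p.pad_bottom, p.stride_y, dilation_y) };
}
```

Under these assumptions, a 224x224 input with a 3x3 kernel, stride 1 and padding 1 on every side keeps its 224x224 size, which is the shape the output tensor would need to have.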

Definition at line 258 of file NEGEMMConvolutionLayer.cpp.

 {
     ARM_COMPUTE_ERROR_ON_NULLPTR(input, weights, output);
     ARM_COMPUTE_UNUSED(num_groups, weights_info);
     ARM_COMPUTE_ERROR_THROW_ON(NEGEMMConvolutionLayer::validate(input->info(),
                                                                 weights->info(),
                                                                 biases != nullptr ? biases->info() : nullptr,
                                                                 output->info(),
                                                                 conv_info,
                                                                 weights_info,
                                                                 dilation,
                                                                 act_info,
                                                                 num_groups));
 
     const DataType   data_type   = input->info()->data_type();
     const DataLayout data_layout = input->info()->data_layout();
     const int        idx_width   = get_data_layout_dimension_index(data_layout, DataLayoutDimension::WIDTH);
     const int        idx_height  = get_data_layout_dimension_index(data_layout, DataLayoutDimension::HEIGHT);
     const int        idx_kernels = get_data_layout_dimension_index(data_layout, DataLayoutDimension::BATCHES);
 
     const unsigned int kernel_width  = weights->info()->dimension(idx_width);
     const unsigned int kernel_height = weights->info()->dimension(idx_height);
 
     _is_prepared      = weights_info.retain_internal_weights();
     _original_weights = weights;
     _original_output  = output;
     _is_quantized     = is_data_type_quantized_asymmetric(input->info()->data_type());
     _data_layout      = data_layout;
     _skip_im2col      = (data_layout == DataLayout::NHWC && kernel_width == 1 && kernel_height == 1 && conv_info.stride().first == 1 && conv_info.stride().second == 1);
 
     const ITensor *gemm_input_to_use  = input;
     ITensor       *gemm_output_to_use = output;
 
     // Get convolved dimensions
     unsigned int conv_w = 0;
     unsigned int conv_h = 0;
     std::tie(conv_w, conv_h) = scaled_dimensions(input->info()->dimension(idx_width),
                                                  input->info()->dimension(idx_height),
                                                  kernel_width,
                                                  kernel_height,
                                                  conv_info,
                                                  dilation);
 
     // Check if GEMM3D is supported
     if(data_layout == DataLayout::NHWC)
     {
         _skip_col2im = bool(validate_gemm3d(input->info(), weights->info(), act_info, conv_h, true));
         // If not supported, we need to perform im2col and col2im (or reshape layer)
         if(!_skip_col2im)
         {
             _skip_im2col = false;
         }
     }
     else
     {
         _skip_col2im = false;
     }
 
     // Get parameters from conv_info
     unsigned int stride_x = 0;
     unsigned int stride_y = 0;
     std::tie(stride_x, stride_y) = conv_info.stride();
 
     unsigned int mat_weights_cols = weights->info()->dimension(idx_kernels);
 
     // _weights_reshaped will be auto configured in the kernel.
     // Just append biases and do not transpose 1xW as it will be reshaped in NEGEMM
     const ITensor *weights_to_use = weights;
 
     if(_weights_manager && _weights_manager->are_weights_managed(weights))
     {
         _reshape_weights_managed.configure(weights, nullptr);
         weights_to_use = _weights_manager->acquire(weights, &_reshape_weights_managed);
     }
     else
     {
         _reshape_weights.configure(weights, nullptr, &_weights_reshaped);
         weights_to_use = &_weights_reshaped;
     }
 
     // Create tensor to store im2col reshaped inputs
     if(!_skip_im2col)
     {
         _memory_group.manage(&_im2col_output);
 
         // Configure
         _im2col_kernel = std::make_unique<NEIm2ColKernel>();
         _im2col_kernel->configure(input, &_im2col_output, Size2D(kernel_width, kernel_height), conv_info, false, dilation);
 
         // Update GEMM input
         gemm_input_to_use = &_im2col_output;
     }
 
     // Create temporary GEMM output tensor in case we cannot skip col2im
     const DataType output_data_type = data_type == DataType::BFLOAT16 ? DataType::F32 : data_type;
     if(!_skip_col2im)
     {
         TensorShape shape_gemm;
 
         // Calculate GEMM output shape
         shape_gemm = _im2col_output.info()->tensor_shape();
         shape_gemm.set(0, mat_weights_cols);
         shape_gemm.set(1, conv_w * conv_h);
 
         // FIXME: input->clone() doesn't work with subtensors for grouped convolutions.
         TensorInfo info_gemm(shape_gemm, 1, output_data_type);
         info_gemm.set_quantization_info(output->info()->quantization_info()).set_data_layout(input->info()->data_layout());
         _gemm_output.allocator()->init(info_gemm);
         _gemm_output_3d.allocator()->init(info_gemm);
         _memory_group.manage(&_gemm_output);
 
         // Update GEMM output
         gemm_output_to_use = &_gemm_output;
     }
     else
     {
         TensorInfo out_info{ *output->info() };
         out_info.set_data_type(output_data_type).set_data_layout(input->info()->data_layout());
         _gemm_output.allocator()->init(out_info);
         _gemm_output_3d.allocator()->init(out_info);
         _memory_group.manage(&_gemm_output);
 
         // Update GEMM output
         gemm_output_to_use = &_gemm_output_3d;
     }
 
     // Configure GEMM
     // In case we need to skip col2im, GEMM3D (gemm_3d_depth != 0) must be called in order to avoid reshaping the output matrix
     const unsigned int gemm_3d_depth = _skip_col2im ? conv_h : 0;
     configure_mm(gemm_input_to_use, weights_to_use, biases, gemm_output_to_use, act_info, gemm_3d_depth);
 
     if(!_skip_im2col)
     {
         _im2col_output.allocator()->allocate();
     }
 
     if(!_skip_col2im)
     {
         if(_data_layout == DataLayout::NCHW)
         {
             // Configure col2im
             _col2im_kernel = std::make_unique<NECol2ImKernel>();
             _col2im_kernel->configure(gemm_output_to_use, output, Size2D(conv_w, conv_h));
         }
         else
         {
             // Configure reshape layer
             _reshape_layer.configure(gemm_output_to_use, output);
         }
     }
     else
     {
         // Configure reshape layer
         _reshape_layer.configure(gemm_output_to_use, output);
     }
 
     if(_is_quantized && !_skip_col2im)
     {
         _tmp_output.allocator()->allocate();
     }
 
     _gemm_output.allocator()->allocate();
 
     ARM_COMPUTE_ERROR_ON_MSG((output->info()->dimension(idx_width) != conv_w) || (output->info()->dimension(idx_height) != conv_h),
                              "Output shape does not match the expected one");
 }
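
The listing above decides early on whether the im2col and col2im stages can be skipped. As a rough standalone sketch of that decision only (Layout, ReshapePlan and plan_reshapes are hypothetical names, and gemm3d_supported merely stands in for the outcome of the validate_gemm3d() check):

```cpp
#include <cassert>

// Hypothetical mirror of the two data layouts referenced in the listing.
enum class Layout { NCHW, NHWC };

struct ReshapePlan
{
    bool skip_im2col;
    bool skip_col2im;
};

// gemm3d_supported stands in for the result of the validate_gemm3d() check.
inline ReshapePlan plan_reshapes(Layout layout, unsigned int kernel_w, unsigned int kernel_h,
                                 unsigned int stride_x, unsigned int stride_y, bool gemm3d_supported)
{
    assert(kernel_w > 0 && kernel_h > 0 && stride_x > 0 && stride_y > 0);
    ReshapePlan plan{};
    // im2col is only a candidate for skipping with NHWC, a 1x1 kernel and unit strides.
    plan.skip_im2col = layout == Layout::NHWC && kernel_w == 1 && kernel_h == 1 && stride_x == 1 && stride_y == 1;
    // col2im can only be skipped for NHWC when the 3D GEMM path is usable.
    plan.skip_col2im = layout == Layout::NHWC && gemm3d_supported;
    // Without the 3D GEMM path, both reshapes have to be performed.
    if(layout == Layout::NHWC && !plan.skip_col2im)
    {
        plan.skip_im2col = false;
    }
    return plan;
}
```

In this sketch an NCHW input never skips either stage, while an NHWC input with a 3x3 kernel can still skip col2im even though im2col runs.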

◆ operator=() [1/2]

NEGEMMConvolutionLayer& operator= ( const NEGEMMConvolutionLayer & )

delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

NEGEMMConvolutionLayer& operator= ( NEGEMMConvolutionLayer && )

delete

Prevent instances of this class from being moved (As this class contains non movable objects)

◆ prepare()

void prepare ( )

override virtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note: The prepare stage might not need all of the function's buffers' backing memory to be available in order to execute.

Reimplemented from IFunction.

Definition at line 618 of file NEGEMMConvolutionLayer.cpp.

References TensorAllocator::allocate(), Tensor::allocator(), IWeightsManager::are_weights_managed(), TensorAllocator::free(), ITensor::is_used(), ITensor::mark_as_unused(), NEGEMM::prepare(), NEGEMMLowpMatrixMultiplyCore::prepare(), IWeightsManager::run(), and NEConvolutionLayerReshapeWeights::run().

Referenced by NEGEMMConvolutionLayer::run().

 {
     if(!_is_prepared)
     {
         if(_weights_manager && _weights_manager->are_weights_managed(_original_weights))
         {
             _weights_manager->run(_original_weights, &_reshape_weights_managed);
         }
         else
         {
             // Run weights reshaping and mark original weights tensor as unused
             _weights_reshaped.allocator()->allocate();
             _reshape_weights.run();
             _original_weights->mark_as_unused();
         }
 
         // Prepare GEMM
         _is_quantized ? _mm_gemmlowp.prepare() : _mm_gemm.prepare();
         if(!_weights_reshaped.is_used())
         {
             _weights_reshaped.allocator()->free();
         }
 
         _is_prepared = true;
     }
 }
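
The guard on _is_prepared makes prepare() safe to call repeatedly: the one-off weight reshaping happens at most once, and not at all when configure() initialised the flag to true from weights_info.retain_internal_weights(). A toy sketch of that pattern, with a hypothetical OneShotPrepare class and counters standing in for the real work:

```cpp
#include <cassert>

// Hypothetical toy class; the counters stand in for the real reshape and GEMM work.
class OneShotPrepare
{
public:
    // retain_internal_weights mimics the flag read in configure().
    explicit OneShotPrepare(bool retain_internal_weights = false)
        : _is_prepared(retain_internal_weights)
    {
    }

    void prepare()
    {
        if(!_is_prepared)
        {
            ++_reshape_count; // one-off pre-processing happens here
            _is_prepared = true;
        }
    }

    void run()
    {
        prepare(); // no-op after the first call
        assert(_is_prepared);
        ++_run_count;
    }

    int reshape_count() const { return _reshape_count; }
    int run_count() const { return _run_count; }

private:
    bool _is_prepared;
    int  _reshape_count{ 0 };
    int  _run_count{ 0 };
};
```

Calling run() twice on a default-constructed OneShotPrepare performs the one-off step once; constructing it with true skips the step entirely.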

◆ run()

void run ( )

override virtual

Run the kernels contained in the function.

For Neon kernels:

Multi-threading is used for the kernels which are parallelisable.
By default std::thread::hardware_concurrency() threads are used.

Note: CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

All the kernels are enqueued on the queue associated with CLScheduler.
The queue is then flushed.

Note: The function will not block until the kernels are executed. It is the user's responsibility to wait. The first call to run() also calls prepare() if that has not been done already.

Implements IFunction.

Definition at line 566 of file NEGEMMConvolutionLayer.cpp.

References Tensor::allocator(), BorderSize::bottom, ITensor::buffer(), Window::DimY, ITensorInfo::extend_padding(), TensorAllocator::free(), Scheduler::get(), arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, TensorAllocator::import_memory(), ITensor::info(), Tensor::info(), arm_compute::NCHW, ITensorInfo::padding(), NEGEMMConvolutionLayer::prepare(), NEReshapeLayer::run(), NEGEMM::run(), NEGEMMLowpMatrixMultiplyCore::run(), IScheduler::schedule(), and BorderSize::top.

 {
     prepare();
 
     MemoryGroupResourceScope scope_mg(_memory_group);
 
     bool out_has_padding = _skip_col2im && (_original_output->info()->padding().bottom != 0 || _original_output->info()->padding().top != 0);
 
     if(!_skip_im2col)
     {
         // Run input reshaping
         unsigned int y_dim = get_data_layout_dimension_index(_data_layout, DataLayoutDimension::HEIGHT);
         NEScheduler::get().schedule(_im2col_kernel.get(), y_dim);
     }
 
     // Handle the case where output has top/bottom padding
     const ITensor *out_to_use = out_has_padding ? &_gemm_output : _original_output;
     _gemm_output_3d.info()->extend_padding(out_to_use->info()->padding());
     _gemm_output_3d.allocator()->import_memory(out_to_use->buffer());
 
     // Runs NEGEMM or NEGEMMLowpMatrixMultiplyCore functions
     if(_is_quantized)
     {
         // Run gemmlowp
         _mm_gemmlowp.run();
     }
     else
     {
         // Run gemm
         _mm_gemm.run();
     }
 
     // Reshape output matrix
     if(!_skip_col2im)
     {
         if(_data_layout == DataLayout::NCHW)
         {
             NEScheduler::get().schedule(_col2im_kernel.get(), Window::DimY);
         }
         else
         {
             _reshape_layer.run();
         }
     }
     else if(out_has_padding)
     {
         _reshape_layer.run();
     }
 
     _gemm_output_3d.allocator()->free();
 }
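
The order of stages in the listing above depends on a handful of flags set during configure(). The sketch below only names the stages that would execute for a given combination; RunFlags and run_stages are hypothetical, and the strings are labels rather than library calls.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical snapshot of the state that run() consults.
struct RunFlags
{
    bool skip_im2col;
    bool skip_col2im;
    bool is_quantized;
    bool is_nchw;
    bool out_has_padding; // only meaningful when skip_col2im is true
};

// Returns the labels of the stages that would execute, in order.
inline std::vector<std::string> run_stages(const RunFlags &f)
{
    // In the listing, out_has_padding is computed as skip_col2im && (top/bottom padding present).
    assert(!f.out_has_padding || f.skip_col2im);

    std::vector<std::string> stages;
    if(!f.skip_im2col)
    {
        stages.push_back("im2col");
    }
    stages.push_back(f.is_quantized ? "gemmlowp" : "gemm");
    if(!f.skip_col2im)
    {
        stages.push_back(f.is_nchw ? "col2im" : "reshape");
    }
    else if(f.out_has_padding)
    {
        stages.push_back("reshape");
    }
    return stages;
}
```

For example, a non-quantized NCHW convolution goes through all three stages, whereas a quantized NHWC 1x1 convolution with an unpadded output reduces to the single matrix multiplication.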

◆ validate()

Status validate	(	const ITensorInfo *	input,
		const ITensorInfo *	weights,
		const ITensorInfo *	biases,
		const ITensorInfo *	output,
		const PadStrideInfo &	conv_info,
		const WeightsInfo &	weights_info = `WeightsInfo()`,
		const Size2D &	dilation = `Size2D(1U, 1U)`,
		const ActivationLayerInfo &	act_info = `ActivationLayerInfo()`,
		unsigned int	num_groups = `1`
	)

static

Static function to check if given info will lead to a valid configuration of NEGEMMConvolutionLayer.

Parameters

[in]	input	Source tensor info. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/BFLOAT16/F16/F32.
[in]	weights	Weights tensor info. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data types supported: QASYMM8/QASYMM8_SIGNED/QSYMM8_PER_CHANNEL/BFLOAT16/F16/F32.
[in]	biases	Biases tensor info. Shared biases supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: Should match `input` data type, except for QASYMM8/QASYMM8_SIGNED input, where biases should be of S32 type, and BFLOAT16 input, where validate() requires F32 biases.
[in]	output	Destination tensor info. The 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: Same as `input`.
[in]	conv_info	Contains padding and stride information described in PadStrideInfo.
[in]	weights_info	Specifies if the weights tensor has been reshaped with NEWeightsReshapeKernel. If this is not part of the fully connected layer the weights tensor has also been transposed with NEGEMMTranspose1xWKernel. Data type supported: Same as `input`.
[in]	dilation	(Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
[in]	act_info	(Optional) Activation layer information in case of a fused activation. Only RELU, BOUNDED_RELU and LU_BOUNDED_RELU supported.
[in]	num_groups	(Optional) Number of groups when performing a grouped convolution. num_groups != 1 is not supported.

Returns: a status
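
The bias data type accepted by validate() depends on the input data type: the implementation checks for S32 with asymmetric quantized input, F32 with BFLOAT16 input, and a matching type otherwise. A small standalone sketch of that rule, where DType and expected_bias_type are hypothetical names and not the library's DataType API:

```cpp
#include <cassert>

// Hypothetical subset of data types, mirroring the names used on this page.
enum class DType { QASYMM8, QASYMM8_SIGNED, BFLOAT16, F16, F32, S32 };

// S32 is only valid for biases in this sketch, never for the input tensor.
inline bool is_supported_input(DType t)
{
    return t != DType::S32;
}

// Bias type that the checks in validate() would accept for a given input type.
inline DType expected_bias_type(DType input)
{
    assert(is_supported_input(input));
    switch(input)
    {
        case DType::QASYMM8:
        case DType::QASYMM8_SIGNED:
            return DType::S32; // asymmetric quantized input
        case DType::BFLOAT16:
            return DType::F32; // BFLOAT16 input
        default:
            return input;      // F16 and F32 biases match the input type
    }
}
```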

Definition at line 426 of file NEGEMMConvolutionLayer.cpp.

References WeightsInfo::are_reshaped(), ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_LAYOUT, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_MSG, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, arm_compute::BATCHES, arm_compute::BFLOAT16, arm_compute::CHANNEL, arm_compute::misc::shape_calculator::compute_weights_reshaped_shape(), arm_compute::test::validation::conv_info, arm_compute::test::validation::data_layout, ITensorInfo::data_layout(), arm_compute::test::validation::data_type, ITensorInfo::data_type(), ITensorInfo::dimension(), arm_compute::F16, arm_compute::F32, arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, arm_compute::test::validation::idx_height, arm_compute::test::validation::idx_width, arm_compute::test::validation::input, arm_compute::is_data_type_quantized_asymmetric(), arm_compute::NCHW, arm_compute::NHWC, ITensorInfo::num_dimensions(), arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, arm_compute::QSYMM8_PER_CHANNEL, ITensorInfo::quantization_info(), arm_compute::S32, arm_compute::scaled_dimensions(), TensorShape::set(), arm_compute::test::validation::set_data_layout(), TensorInfo::set_quantization_info(), PadStrideInfo::stride(), ITensorInfo::tensor_shape(), NEConvolutionLayerReshapeWeights::validate(), NECol2ImKernel::validate(), NEIm2ColKernel::validate(), and arm_compute::WIDTH.

Referenced by NEGEMMConvolutionLayer::configure(), and NEConvolutionLayer::validate().

 {
     ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(input, weights, output);
     ARM_COMPUTE_RETURN_ERROR_ON_MSG(weights_info.are_reshaped(), "Weights already reshaped are not supported!");
     ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(input, 1, DataType::QASYMM8, DataType::QASYMM8_SIGNED, DataType::BFLOAT16, DataType::F16, DataType::F32);
     ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(weights, 1, DataType::QASYMM8, DataType::QASYMM8_SIGNED, DataType::QSYMM8_PER_CHANNEL, DataType::BFLOAT16, DataType::F16, DataType::F32);
     ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_LAYOUT(input, weights);
     ARM_COMPUTE_RETURN_ERROR_ON_MSG(num_groups > 1, "Grouping (num_groups != 1) is not supported on Neon");
 
     const DataLayout data_layout = input->data_layout();
     const DataType   data_type   = input->data_type();
     const int        idx_width   = get_data_layout_dimension_index(data_layout, DataLayoutDimension::WIDTH);
     const int        idx_height  = get_data_layout_dimension_index(data_layout, DataLayoutDimension::HEIGHT);
     const int        idx_channel = get_data_layout_dimension_index(data_layout, DataLayoutDimension::CHANNEL);
     const int        idx_kernels = get_data_layout_dimension_index(data_layout, DataLayoutDimension::BATCHES);
 
     const unsigned int kernel_width  = weights->dimension(idx_width);
     const unsigned int kernel_height = weights->dimension(idx_height);
 
     TensorInfo         im2col_reshaped_info{};
     TensorInfo         info_gemm{};
     TensorInfo         tmp_info{};
     TensorInfo         weights_reshaped_info{};
     const ITensorInfo *gemm_input_to_use  = input;
     const ITensorInfo *gemm_output_to_use = output;
     const ITensorInfo *weights_to_use     = weights;
 
     const bool append_bias  = false;
     const bool is_quantized = is_data_type_quantized_asymmetric(data_type);
     const bool is_bf16      = data_type == DataType::BFLOAT16;
     bool       skip_im2col  = (data_layout == DataLayout::NHWC && kernel_width == 1 && kernel_height == 1 && conv_info.stride().first == 1 && conv_info.stride().second == 1);
 
     // Get convolved dimensions
     unsigned int conv_w = 0;
     unsigned int conv_h = 0;
 
     std::tie(conv_w, conv_h) = scaled_dimensions(input->dimension(idx_width),
                                                  input->dimension(idx_height),
                                                  kernel_width,
                                                  kernel_height,
                                                  conv_info,
                                                  dilation);
 
     // Check if GEMM3D is supported
     bool skip_col2im = false;
     if(data_layout == DataLayout::NHWC)
     {
         skip_col2im = bool(validate_gemm3d(input, weights, act_info, conv_h, true));
         // If not supported, we need to perform im2col and col2im (or reshape layer)
         if(!skip_col2im)
         {
             skip_im2col = false;
         }
     }
 
     if(skip_col2im)
     {
         // If not supported, we need to perform im2col and col2im (or reshape layer)
         if(!bool(validate_gemm3d(input, weights, act_info, conv_h, skip_im2col)))
         {
             skip_im2col = false;
             skip_col2im = false;
         }
     }
 
     ARM_COMPUTE_RETURN_ERROR_ON(weights->dimension(idx_channel) != input->dimension(idx_channel));
     ARM_COMPUTE_RETURN_ERROR_ON(weights->num_dimensions() > 4);
 
     // Validate biases
     if(biases != nullptr)
     {
         if(is_quantized)
         {
             ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(biases, 1, DataType::S32);
         }
         else if(is_bf16)
         {
             ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(biases, 1, DataType::F32);
         }
         else
         {
             ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(input, biases);
         }
         ARM_COMPUTE_RETURN_ERROR_ON(biases->dimension(0) != weights->dimension(idx_kernels));
         ARM_COMPUTE_RETURN_ERROR_ON(biases->num_dimensions() > 1);
     }
 
     unsigned int mat_weights_cols = weights->dimension(idx_kernels);
     unsigned int mat_weights_rows = weights->dimension(idx_width) * weights->dimension(idx_height) * weights->dimension(idx_channel);
 
     // Output tensor auto initialization if not yet initialized
     ARM_COMPUTE_RETURN_ON_ERROR(NEConvolutionLayerReshapeWeights::validate(weights, nullptr, nullptr));
     weights_reshaped_info = TensorInfo(compute_weights_reshaped_shape(*weights, append_bias), 1, data_type);
     weights_reshaped_info.set_quantization_info(weights->quantization_info());
     weights_to_use = &weights_reshaped_info;
 
     if(!skip_im2col)
     {
         // Create tensor info for im2col reshaped inputs
         // For Neon the batch size is on the fourth dimension
         // TODO (giaiod01): Auto-initialize the output shape of im2col COMPMID-1482
         TensorShape shape_im2col = input->tensor_shape();
         shape_im2col.set(0, mat_weights_rows);
         shape_im2col.set(1, conv_w * conv_h);
         shape_im2col.set(2, 1);
 
         im2col_reshaped_info = TensorInfo(shape_im2col, 1, data_type);
         im2col_reshaped_info.set_quantization_info(input->quantization_info());
 
         ARM_COMPUTE_RETURN_ON_ERROR(NEIm2ColKernel::validate(input, &im2col_reshaped_info, Size2D(kernel_width, kernel_height), conv_info, append_bias, dilation));
         gemm_input_to_use = &im2col_reshaped_info;
     }
 
     // Create temporary GEMM output tensor in case we cannot skip col2im
     const DataType output_data_type = data_type == DataType::BFLOAT16 ? DataType::F32 : data_type;
     if(!skip_col2im)
     {
         TensorShape shape_gemm = gemm_input_to_use->tensor_shape();
         shape_gemm.set(0, mat_weights_cols);
         shape_gemm.set(1, conv_w * conv_h);
         info_gemm = TensorInfo(shape_gemm, 1, output_data_type);
     }
     else
     {
         info_gemm = TensorInfo(output->tensor_shape(), 1, output_data_type);
     }
     info_gemm.set_quantization_info(output->quantization_info()).set_data_layout(input->data_layout());
     gemm_output_to_use = &info_gemm;
     ARM_COMPUTE_RETURN_ON_ERROR(validate_mm(gemm_input_to_use, weights_to_use, biases, gemm_output_to_use, act_info, skip_col2im ? conv_h : 0, skip_im2col));
 
     // Validate Col2Im/ReshapeLayer
     if(!skip_col2im && (data_layout == DataLayout::NCHW))
     {
         ARM_COMPUTE_RETURN_ON_ERROR(NECol2ImKernel::validate(gemm_output_to_use, output, Size2D(conv_w, conv_h)));
     }
 
     return Status{};
 }
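
When neither im2col nor col2im is skipped, the listing above derives the im2col and temporary GEMM output shapes from the kernel size, the channel counts and the convolved dimensions. The following sketch reproduces only that shape arithmetic for a 4D input; Shape4, GemmShapes and gemm_shapes are hypothetical names, and the batch dimension is assumed to sit at index 3.

```cpp
#include <array>
#include <cassert>

// Hypothetical 4D shape, lowest dimension first.
using Shape4 = std::array<unsigned int, 4>;

struct GemmShapes
{
    Shape4 im2col;   // shape of the im2col-reshaped input
    Shape4 gemm_out; // shape of the temporary GEMM output
};

// Assumes the path where both im2col and col2im are performed.
inline GemmShapes gemm_shapes(unsigned int kernel_w, unsigned int kernel_h,
                              unsigned int ifm, unsigned int ofm,
                              unsigned int conv_w, unsigned int conv_h,
                              unsigned int batches)
{
    assert(kernel_w > 0 && kernel_h > 0 && ifm > 0 && ofm > 0);
    // One row per kernel element and input channel, one column per kernel (OFM).
    const unsigned int mat_weights_rows = kernel_w * kernel_h * ifm;
    const unsigned int mat_weights_cols = ofm;

    GemmShapes s{};
    s.im2col   = Shape4{ mat_weights_rows, conv_w * conv_h, 1U, batches };
    s.gemm_out = Shape4{ mat_weights_cols, conv_w * conv_h, 1U, batches };
    return s;
}
```

As a usage example under these assumptions, a 3x3 kernel with 64 input and 128 output feature maps, a 56x56 convolved size and a batch of 2 yields an im2col shape of [576, 3136, 1, 2] and a GEMM output shape of [128, 3136, 1, 2].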

The documentation for this class was generated from the following files:

arm_compute/runtime/NEON/functions/NEGEMMConvolutionLayer.h
src/runtime/NEON/functions/NEGEMMConvolutionLayer.cpp
