Function to run the deconvolution layer through a call to GEMM.

#include <CLGEMMDeconvolutionLayer.h>

Public Member Functions
	CLGEMMDeconvolutionLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
	Constructor.

	CLGEMMDeconvolutionLayer (const CLGEMMDeconvolutionLayer &)=delete
	Prevent instances of this class from being copied (As this class contains pointers)

	CLGEMMDeconvolutionLayer (CLGEMMDeconvolutionLayer &&)=default
	Default move constructor.

CLGEMMDeconvolutionLayer &	operator= (const CLGEMMDeconvolutionLayer &)=delete
	Prevent instances of this class from being copied (As this class contains pointers)

CLGEMMDeconvolutionLayer &	operator= (CLGEMMDeconvolutionLayer &&)=default
	Default move assignment operator.

	~CLGEMMDeconvolutionLayer ()
	Default destructor.

void	configure (const ICLTensor *input, const ICLTensor *weights, const ICLTensor *bias, ICLTensor *output, const PadStrideInfo &deconv_info)
	Set the input, weights, biases and output tensors.

void	configure (const CLCompileContext &compile_context, const ICLTensor *input, const ICLTensor *weights, const ICLTensor *bias, ICLTensor *output, const PadStrideInfo &deconv_info)
	Set the input, weights, biases and output tensors.

void	run () override
	Run the kernels contained in the function.

void	prepare () override
	Prepare the function for executing.

Public Member Functions inherited from IFunction
virtual	~IFunction ()=default
	Destructor.

Static Public Member Functions
static Status	validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *bias, const ITensorInfo *output, const PadStrideInfo &deconv_info)
	Static function to check if given info will lead to a valid configuration of CLDeconvolutionLayer.

Detailed Description

Function to run the deconvolution layer through a call to GEMM.

The deconvolution layer is the backward pass of the convolution layer. The input is first transformed according to the stride and padding information, and a 1x1 convolution pass is then performed. The input stride defines how many zeros are inserted between adjacent elements of the input, pad is the amount of padding, and a is a user-specified value, with a < stride - 1, that increases the padding at the top and right of the input image.

The relation between input to output is as follows:

\[ width\_output = (width\_input - 1) \cdot stride\_x - 2 \cdot padding\_x + kernel\_x \]

\[ height\_output = (height\_input - 1) \cdot stride\_y - 2 \cdot padding\_y + kernel\_y \]

where:
width_input is the size of the first input dimension.
height_input is the size of the second input dimension.
width_output is the size of the first output dimension.
height_output is the size of the second output dimension.
kernel_x and kernel_y are the convolution sizes in x and y.
stride_x and stride_y are the input strides of the first and second dimension.
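As a worked instance of the relation above, the following minimal sketch evaluates the formula for a single dimension. The helper name `deconv_output_dim` is hypothetical and is not part of the library; it only restates the arithmetic.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a library function): evaluates
//   out = (in - 1) * stride - 2 * padding + kernel
// for one dimension. The terms are reordered so that the unsigned
// intermediate values never go below zero for valid inputs.
inline std::size_t deconv_output_dim(std::size_t in, std::size_t stride,
                                     std::size_t padding, std::size_t kernel)
{
    assert(in >= 1);
    assert((in - 1) * stride + kernel >= 2 * padding);
    return (in - 1) * stride + kernel - 2 * padding;
}
```

For example, `deconv_output_dim(4, 2, 0, 2)` gives 8: when the kernel size equals the stride and there is no padding, the relation reduces to `in * stride`.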

The weights used by Deconvolution are supposed to be the same as the ones used for Convolution.

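This function calls the following OpenCL kernels/functions:

CLPermute
CLReshapeLayer
CLTranspose
CLGEMM
CLGEMMLowpMatrixMultiplyCore
CLDeconvolutionReshapeOutputKernel
CLGEMMLowpOutputStage
CLSlice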

Definition at line 79 of file CLGEMMDeconvolutionLayer.h.

Constructor & Destructor Documentation

◆ CLGEMMDeconvolutionLayer() [1/3]

CLGEMMDeconvolutionLayer ( std::shared_ptr< IMemoryManager > memory_manager = nullptr )

Constructor.

Definition at line 107 of file CLGEMMDeconvolutionLayer.cpp.

     : _memory_group(std::move(memory_manager)),
       _mm_gemm(),
       _mm_gemmlowp(),
       _gemmlowp_output_stage(),
       _permute_input_to_nhwc(),
       _permute_weights_to_nhwc(),
       _reshape_weights(),
       _transpose_weights(),
       _deconv_reshape(std::make_unique<CLDeconvolutionReshapeOutputKernel>()),
       _slice_gemm(),
       _gemmlowp_final(),
       _reshaped_weights(),
       _reshaped_weights_t(),
       _permuted_input(),
       _permuted_weights(),
       _gemm_output(),
       _slice_gemm_input(),
       _original_weights(),
       _is_prepared(false),
       _padded_input(false),
       _is_nchw(false),
       _is_quantized(false)
 {
 }

◆ CLGEMMDeconvolutionLayer() [2/3]

CLGEMMDeconvolutionLayer ( const CLGEMMDeconvolutionLayer & )

delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ CLGEMMDeconvolutionLayer() [3/3]

CLGEMMDeconvolutionLayer ( CLGEMMDeconvolutionLayer && )

default

Default move constructor.

◆ ~CLGEMMDeconvolutionLayer()

~CLGEMMDeconvolutionLayer ( )

default

Default destructor.

Member Function Documentation

◆ configure() [1/2]

void configure	(	const ICLTensor *	input,
		const ICLTensor *	weights,
		const ICLTensor *	bias,
		ICLTensor *	output,
		const PadStrideInfo &	deconv_info
	)

Set the input, weights, biases and output tensors.

Parameters

[in,out]	input	Input tensor. 3 lower dimensions represent a single input, and an optional 4th dimension for batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32. Data layout supported: NHWC
[in]	weights	The 4d weights with dimensions [width, height, IFM, OFM]. Data type supported: Same as `input`. Data layout supported: same as `input`.
[in]	bias	(Optional) The biases have one dimension. Data type supported: Same as `input`. Data layout supported: same as `input`.
[out]	output	Output tensor. The output has the same number of dimensions as the `input`. Data layout supported: same as `input`.
[in]	deconv_info	Contains padding and policies to be used in the deconvolution; this is described in PadStrideInfo. This function supports only stride_x = weights.width and stride_y = weights.height. Moreover, padding is not supported.

Definition at line 230 of file CLGEMMDeconvolutionLayer.cpp.

References CLKernelLibrary::get().

 {
     configure(CLKernelLibrary::get().get_compile_context(), input, weights, bias, output, deconv_info);
 }

◆ configure() [2/2]

void configure	(	const CLCompileContext &	compile_context,
		const ICLTensor *	input,
		const ICLTensor *	weights,
		const ICLTensor *	bias,
		ICLTensor *	output,
		const PadStrideInfo &	deconv_info
	)

Set the input, weights, biases and output tensors.

Parameters

[in]	compile_context	The compile context to be used.
[in,out]	input	Input tensor. 3 lower dimensions represent a single input, and an optional 4th dimension for batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32. Data layout supported: NHWC
[in]	weights	The 4d weights with dimensions [width, height, IFM, OFM]. Data type supported: Same as `input`. Data layout supported: same as `input`.
[in]	bias	(Optional) The biases have one dimension. Data type supported: Same as `input`. Data layout supported: same as `input`.
[out]	output	Output tensor. The output has the same number of dimensions as the `input`. Data layout supported: same as `input`.
[in]	deconv_info	Contains padding and policies to be used in the deconvolution; this is described in PadStrideInfo. This function supports only stride_x = weights.width and stride_y = weights.height. Moreover, padding is not supported.

Definition at line 235 of file CLGEMMDeconvolutionLayer.cpp.

References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, CLTranspose::configure(), CLReshapeLayer::configure(), CLPermute::configure(), CLGEMMLowpMatrixMultiplyCore::configure(), CLSlice::configure(), CLGEMM::configure(), CLGEMMLowpOutputStage::configure(), ITensorInfo::data_layout(), ITensorInfo::data_type(), ITensorInfo::dimension(), arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, ITensor::info(), CLTensor::info(), ITensorAllocator::init(), arm_compute::test::validation::input, arm_compute::is_data_type_quantized_asymmetric(), MemoryGroup::manage(), arm_compute::NCHW, UniformQuantizationInfo::offset, PadStrideInfo::pad_bottom(), PadStrideInfo::pad_left(), PadStrideInfo::pad_right(), PadStrideInfo::pad_top(), ITensorInfo::quantization_info(), UniformQuantizationInfo::scale, TensorInfo::set_quantization_info(), arm_compute::U, QuantizationInfo::uniform(), and CLGEMMDeconvolutionLayer::validate().

 {
     ARM_COMPUTE_ERROR_ON_NULLPTR(input, weights, output);
     ARM_COMPUTE_ERROR_THROW_ON(CLGEMMDeconvolutionLayer::validate(input->info(),
                                                                   weights->info(),
                                                                   bias != nullptr ? bias->info() : nullptr,
                                                                   output->info(),
                                                                   deconv_info));
 
     _original_weights = weights;
     _padded_input     = deconv_info.pad_bottom() > 0 || deconv_info.pad_left() > 0 || deconv_info.pad_right() > 0 || deconv_info.pad_top() > 0;
     _is_nchw          = input->info()->data_layout() == DataLayout::NCHW;
     _is_quantized     = is_data_type_quantized_asymmetric(input->info()->data_type());
 
     const ICLTensor *input_to_use   = input;
     const ICLTensor *weights_to_use = weights;
 
     // If the data layout is NCHW, transform everything in NHWC. Another alternative could be to
     // do an outer product in NCHW and then an accumulation through a reduction. This would have two
     // drawbacks: first, the outer product is less efficient than a full GEMM. Second, the reduction
     // might be slower than GEMM.
     if(_is_nchw)
     {
         _memory_group.manage(&_permuted_input);
         _permute_input_to_nhwc.configure(compile_context, input, &_permuted_input, PermutationVector(2U, 0U, 1U));
 
         _permute_weights_to_nhwc.configure(compile_context, weights, &_permuted_weights, PermutationVector(2U, 0U, 1U));
 
         input_to_use   = &_permuted_input;
         weights_to_use = &_permuted_weights;
     }
 
     // Reshape the input weights. The weights will be reshaped only once during the call to prepare()
     _reshaped_weights.allocator()->init(TensorInfo(TensorShape(weights_to_use->info()->dimension(0),
                                                                weights_to_use->info()->dimension(1) * weights_to_use->info()->dimension(2) * weights_to_use->info()->dimension(3)),
                                                    1,
                                                    input->info()->data_type(), weights->info()->quantization_info()));
 
     _reshape_weights.configure(compile_context, weights_to_use, &_reshaped_weights);
     _transpose_weights.configure(compile_context, &_reshaped_weights, &_reshaped_weights_t);
 
     const size_t idx_h = get_data_layout_dimension_index(input->info()->data_layout(), DataLayoutDimension::HEIGHT);
     GEMMInfo     gemm_info(false, false, true, input->info()->dimension(idx_h), true);
 
     // Configure output stage for asymmetric quantized types
     if(_is_quantized)
     {
         // gemmlowp adds the offsets (instead of subtracting them). Thus, we need to negate the original
         // and restore them back to make it work properly.
         QuantizationInfo iq_info = input->info()->quantization_info();
         QuantizationInfo wq_info = weights->info()->quantization_info();
 
         input_to_use->info()->set_quantization_info(QuantizationInfo(iq_info.uniform().scale, -iq_info.uniform().offset));
         _reshaped_weights_t.info()->set_quantization_info(QuantizationInfo(wq_info.uniform().scale, -wq_info.uniform().offset));
 
         _mm_gemmlowp.configure(compile_context, input_to_use, &_reshaped_weights_t, nullptr, &_gemm_output, gemm_info);
 
         input_to_use->info()->set_quantization_info(iq_info);
         _reshaped_weights_t.info()->set_quantization_info(wq_info);
     }
     else
     {
         _mm_gemm.configure(compile_context, input_to_use, &_reshaped_weights_t, nullptr, &_gemm_output, 1.f, 0.0f, gemm_info);
     }
 
     if(_is_nchw)
     {
         _permuted_input.allocator()->allocate();
     }
 
     ICLTensor *deconv_reshape_output = nullptr;
     ICLTensor *slice_output          = nullptr;
     ICLTensor *output_stage_output   = nullptr;
 
     if(_padded_input && _is_quantized)
     {
         _memory_group.manage(&_slice_gemm_input);
         _memory_group.manage(&_gemmlowp_final);
         deconv_reshape_output = &_gemmlowp_final;
         output_stage_output   = &_slice_gemm_input;
         slice_output          = output;
     }
     else if(_padded_input)
     {
         _memory_group.manage(&_slice_gemm_input);
         deconv_reshape_output = &_slice_gemm_input;
         slice_output          = output;
     }
     else if(_is_quantized)
     {
         _memory_group.manage(&_gemmlowp_final);
         deconv_reshape_output = &_gemmlowp_final;
         output_stage_output   = output;
     }
     else
     {
         deconv_reshape_output = output;
     }
 
     // Configure a Col2Im call to reshape the output of GEMM
     _deconv_reshape->configure(compile_context, &_gemm_output, bias, deconv_reshape_output, input->info(), weights->info(), deconv_info);
     _gemm_output.allocator()->allocate();
 
     if(_is_quantized)
     {
         GEMMLowpOutputStageInfo output_stage_info;
         construct_gemmlowp_output_stage(input->info(), weights->info(), output->info(), output_stage_info);
         _gemmlowp_output_stage.configure(compile_context, &_gemmlowp_final, nullptr, output_stage_output, output_stage_info);
         _gemmlowp_final.allocator()->allocate();
     }
 
     // If the input was padded, the output needs to be sliced.
     if(_padded_input)
     {
         const auto start_end = compute_start_end_slice_coordinates(*deconv_reshape_output->info(), deconv_info, _is_nchw);
         _slice_gemm.configure(compile_context, &_slice_gemm_input, slice_output, start_end.first, start_end.second);
         _slice_gemm_input.allocator()->allocate();
     }
 }
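The buffer selection near the end of configure() depends only on the two flags `_padded_input` and `_is_quantized`. The sketch below restates that selection as a lookup so the four cases can be compared side by side. `Routing` and `select_routing` are hypothetical names introduced for illustration, and the strings merely echo the member names from the listing above.

```cpp
#include <cassert>
#include <string>

// Destination of each post-GEMM stage; an empty string means the stage is
// not configured. The names echo the members used in configure().
struct Routing
{
    std::string reshape_out;      // written by the deconvolution reshape kernel
    std::string output_stage_out; // written by the GEMMLowp output stage
    std::string slice_out;        // written by the slice function
};

// Hypothetical helper mirroring the if / else-if chain in configure().
inline Routing select_routing(bool padded_input, bool is_quantized)
{
    if(padded_input && is_quantized)
    {
        return { "_gemmlowp_final", "_slice_gemm_input", "output" };
    }
    if(padded_input)
    {
        return { "_slice_gemm_input", "", "output" };
    }
    if(is_quantized)
    {
        return { "_gemmlowp_final", "output", "" };
    }
    return { "output", "", "" };
}
```

In every case exactly one stage writes to the user-provided output tensor, and intermediate tensors are only introduced for the stages that are actually needed.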

◆ operator=() [1/2]

CLGEMMDeconvolutionLayer& operator= ( const CLGEMMDeconvolutionLayer & )

delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

CLGEMMDeconvolutionLayer& operator= ( CLGEMMDeconvolutionLayer && )

default

Default move assignment operator.

◆ prepare()

void prepare ( )

override virtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note: The prepare stage might not need all of the function's buffers' backing memory to be available in order to execute.

Reimplemented from IFunction.

Definition at line 389 of file CLGEMMDeconvolutionLayer.cpp.

References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_ON, CLTensorAllocator::free(), ITensor::is_used(), ITensor::mark_as_unused(), CLGEMMLowpMatrixMultiplyCore::prepare(), CLGEMM::prepare(), ICLSimpleFunction::run(), CLReshapeLayer::run(), and CLPermute::run().

Referenced by CLGEMMDeconvolutionLayer::run().

 {
     if(!_is_prepared)
     {
         ARM_COMPUTE_ERROR_ON(!_original_weights->is_used());
 
         if(_is_nchw)
         {
             _permuted_weights.allocator()->allocate();
             _permute_weights_to_nhwc.run();
         }
 
         _reshaped_weights.allocator()->allocate();
         _reshape_weights.run();
 
         if(_is_nchw)
         {
             _permuted_weights.allocator()->free();
         }
 
         _reshaped_weights_t.allocator()->allocate();
         _transpose_weights.run();
 
         // Prepare gemm
         if(!_is_quantized)
         {
             _mm_gemm.prepare();
         }
         else
         {
             _mm_gemmlowp.prepare();
         }
 
         // Free resources
         if(!_reshaped_weights_t.is_used())
         {
             _reshaped_weights_t.allocator()->free();
         }
 
         _original_weights->mark_as_unused();
         _is_prepared = true;
     }
 }
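The weight reshape and transpose that prepare() runs once produce 2D tensors whose shapes follow from the `TensorShape` expressions in configure() and validate(). The following is a minimal sketch of that shape arithmetic, assuming the weights are already in NHWC storage order `[IFM, width, height, OFM]`. The type alias and function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Shape2D = std::array<std::size_t, 2>;

// Collapse the three outer dimensions into one, keeping dimension 0:
//   [d0, d1, d2, d3] -> [d0, d1 * d2 * d3]
inline Shape2D reshaped_weights_shape(const std::array<std::size_t, 4> &nhwc_weights)
{
    return { nhwc_weights[0], nhwc_weights[1] * nhwc_weights[2] * nhwc_weights[3] };
}

// Swap the two dimensions, as the transpose step does.
inline Shape2D transposed_shape(const Shape2D &shape)
{
    return { shape[1], shape[0] };
}
```

For weights with 16 input feature maps, a 2x2 kernel and 8 output feature maps, the reshaped shape is `[16, 32]` and the transposed shape is `[32, 16]`.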

◆ run()

void run ( )

override virtual

Run the kernels contained in the function.

For Neon kernels:

Multi-threading is used for the kernels which are parallelisable.
By default std::thread::hardware_concurrency() threads are used.

Note: CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

All the kernels are enqueued on the queue associated with CLScheduler.
The queue is then flushed.

Note: The function will not block until the kernels are executed. It is the user's responsibility to wait. prepare() is called on the first run if it has not been called already.

Implements IFunction.

Definition at line 356 of file CLGEMMDeconvolutionLayer.cpp.

References CLScheduler::enqueue(), CLScheduler::get(), CLGEMMDeconvolutionLayer::prepare(), ICLSimpleFunction::run(), CLPermute::run(), CLGEMMLowpMatrixMultiplyCore::run(), CLSlice::run(), and CLGEMM::run().

 {
     prepare();
 
     MemoryGroupResourceScope scope_mg(_memory_group);
 
     if(_is_nchw)
     {
         _permute_input_to_nhwc.run();
     }
 
     if(_is_quantized)
     {
         _mm_gemmlowp.run();
     }
     else
     {
         _mm_gemm.run();
     }
 
     CLScheduler::get().enqueue(*_deconv_reshape, false);
 
     if(_is_quantized)
     {
         _gemmlowp_output_stage.run();
     }
 
     if(_padded_input)
     {
         _slice_gemm.run();
     }
 }
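The order in which run() dispatches its stages can be summarised with a small model. This is only a sketch: `stages_run` is a hypothetical function, and the stage labels are informal names for the members used in the listing above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the stages that run() would dispatch, in order, for the given flags.
inline std::vector<std::string> stages_run(bool is_nchw, bool is_quantized, bool padded_input)
{
    std::vector<std::string> stages;
    if(is_nchw)
    {
        stages.push_back("permute_input_to_nhwc");
    }
    // Exactly one of the two matrix multiplications is used.
    stages.push_back(is_quantized ? "gemmlowp" : "gemm");
    // The reshape kernel always runs.
    stages.push_back("deconv_reshape");
    if(is_quantized)
    {
        stages.push_back("gemmlowp_output_stage");
    }
    if(padded_input)
    {
        stages.push_back("slice");
    }
    return stages;
}
```

The shortest path (NHWC, floating point, no padding) has two stages; the longest (NCHW, quantized, padded) has five.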

◆ validate()

Status validate	(	const ITensorInfo *	input,
		const ITensorInfo *	weights,
		const ITensorInfo *	bias,
		const ITensorInfo *	output,
		const PadStrideInfo &	deconv_info
	)

static

Static function to check if given info will lead to a valid configuration of CLDeconvolutionLayer.

Parameters

[in]	input	Input tensor info. 3 lower dimensions represent a single input, and an optional 4th dimension for batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32. Data layout supported: NHWC
[in]	weights	The 4d weights info with dimensions [width, height, IFM, OFM]. Data type supported: Same as `input`. Data layout supported: same as `input`.
[in]	bias	(Optional) The biases have one dimension. Data type supported: Same as `input`. Data layout supported: same as `input`.
[in]	output	Output tensor info. The output has the same number of dimensions as the `input`. Data layout supported: same as `input`.
[in]	deconv_info	Contains padding and policies to be used in the deconvolution; this is described in PadStrideInfo.

Returns: a status

Definition at line 135 of file CLGEMMDeconvolutionLayer.cpp.

References ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_LAYOUT, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, arm_compute::BATCHES, ICloneable< T >::clone(), TensorInfo::clone(), arm_compute::misc::shape_calculator::compute_deconvolution_output_shape(), arm_compute::test::validation::data_layout, ITensorInfo::data_layout(), ITensorInfo::data_type(), arm_compute::deconvolution_output_dimensions(), ITensorInfo::dimension(), arm_compute::F16, arm_compute::F32, arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, arm_compute::is_data_type_quantized_asymmetric(), arm_compute::NCHW, PadStrideInfo::pad_bottom(), PadStrideInfo::pad_left(), PadStrideInfo::pad_right(), PadStrideInfo::pad_top(), arm_compute::permute(), arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, arm_compute::S32, TensorInfo::set_data_type(), PadStrideInfo::stride(), ITensorInfo::tensor_shape(), CLTranspose::validate(), CLReshapeLayer::validate(), CLPermute::validate(), CLDeconvolutionReshapeOutputKernel::validate(), CLGEMMLowpMatrixMultiplyCore::validate(), CLSlice::validate(), CLGEMM::validate(), CLGEMMLowpOutputStage::validate(), and arm_compute::WIDTH.

Referenced by CLGEMMDeconvolutionLayer::configure(), and CLDeconvolutionLayer::validate().

 {
     ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(input, weights, output);
     ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(input, 1, DataType::F32, DataType::F16, DataType::QASYMM8, DataType::QASYMM8_SIGNED);
     ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(input, weights);
     ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_LAYOUT(input, weights);
 
     DataLayout data_layout  = input->data_layout();
     const bool padded_input = deconv_info.pad_bottom() > 0 || deconv_info.pad_left() > 0 || deconv_info.pad_right() > 0 || deconv_info.pad_top() > 0;
     const bool is_nchw      = input->data_layout() == DataLayout::NCHW;
     const bool is_quantized = is_data_type_quantized_asymmetric(input->data_type());
 
     const size_t idx_w = get_data_layout_dimension_index(data_layout, DataLayoutDimension::WIDTH);
     const size_t idx_h = get_data_layout_dimension_index(data_layout, DataLayoutDimension::HEIGHT);
     const size_t idx_b = get_data_layout_dimension_index(data_layout, DataLayoutDimension::BATCHES);
 
     ARM_COMPUTE_RETURN_ERROR_ON(weights->dimension(idx_w) != deconv_info.stride().first);
     ARM_COMPUTE_RETURN_ERROR_ON(weights->dimension(idx_h) != deconv_info.stride().second);
 
     TensorShape nhwc_weights_shape = weights->tensor_shape();
     TensorShape nhwc_input_shape   = input->tensor_shape();
 
     if(is_nchw)
     {
         permute(nhwc_weights_shape, PermutationVector(2, 0, 1));
         permute(nhwc_input_shape, PermutationVector(2, 0, 1));
 
         TensorInfo nhwc_input_info = input->clone()->set_is_resizable(true).reset_padding().set_tensor_shape(nhwc_input_shape).set_data_layout(DataLayout::NCHW);
 
         TensorInfo nhwc_weights_info = weights->clone()->set_is_resizable(true).reset_padding().set_tensor_shape(nhwc_weights_shape).set_data_layout(DataLayout::NCHW);
 
         CLPermute::validate(weights, &nhwc_weights_info, PermutationVector(2, 0, 1));
         CLPermute::validate(input, &nhwc_input_info, PermutationVector(2, 0, 1));
     }
 
     const TensorShape reshaped_shape = TensorShape(nhwc_weights_shape[0], nhwc_weights_shape[1] * nhwc_weights_shape[2] * nhwc_weights_shape[3]);
     const TensorInfo  reshaped_info  = weights->clone()->set_tensor_shape(reshaped_shape).set_data_layout(DataLayout::NCHW).set_is_resizable(true);
     ARM_COMPUTE_RETURN_ON_ERROR(CLReshapeLayer::validate(weights, &reshaped_info));
 
     TensorShape      transposed_shape(reshaped_shape[1], reshaped_shape[0]);
     const TensorInfo reshaped_t_info = reshaped_info.clone()->set_is_resizable(true).set_tensor_shape(transposed_shape);
     ARM_COMPUTE_RETURN_ON_ERROR(CLTranspose::validate(&reshaped_info, &reshaped_t_info));
 
     TensorShape gemm_output_shape(weights->dimension(idx_w) * weights->dimension(idx_h) * weights->dimension(idx_b),
                                   input->dimension(idx_w),
                                   input->dimension(idx_h),
                                   input->dimension(idx_b));
 
     TensorInfo gemm_output_info = reshaped_t_info.clone()->set_tensor_shape(gemm_output_shape).set_is_resizable(true);
     GEMMInfo   gemm_info(false, false, true, input->dimension(idx_h), true);
 
     GEMMLowpOutputStageInfo output_stage_info;
 
     if(is_quantized)
     {
         ARM_COMPUTE_RETURN_ON_ERROR(CLGEMMLowpMatrixMultiplyCore::validate(&input->clone()->set_tensor_shape(nhwc_input_shape), &reshaped_t_info, nullptr, &gemm_output_info.set_data_type(DataType::S32),
                                                                            gemm_info));
         ARM_COMPUTE_RETURN_ON_ERROR(construct_gemmlowp_output_stage(input, weights, output, output_stage_info));
     }
     else
     {
         ARM_COMPUTE_RETURN_ON_ERROR(CLGEMM::validate(&input->clone()->set_tensor_shape(nhwc_input_shape).set_is_resizable(true), &reshaped_t_info, nullptr, &gemm_output_info, 1.0f, 0.0f, gemm_info));
     }
 
     const PadStrideInfo stride_info(deconv_info.stride().first, deconv_info.stride().second);
     auto                out_dims           = deconvolution_output_dimensions(input->dimension(idx_w), input->dimension(idx_h), weights->dimension(idx_w), weights->dimension(idx_h), stride_info);
     const TensorShape   deconv_shape       = misc::shape_calculator::compute_deconvolution_output_shape(out_dims, *input, *weights);
     TensorInfo          col2im_output_info = gemm_output_info.clone()->set_tensor_shape(deconv_shape).set_is_resizable(true);
 
     if(padded_input && is_quantized)
     {
         const auto start_end = compute_start_end_slice_coordinates(col2im_output_info, deconv_info, is_nchw);
         ARM_COMPUTE_RETURN_ON_ERROR(CLDeconvolutionReshapeOutputKernel::validate(&gemm_output_info, bias, &col2im_output_info, input, weights, deconv_info));
         ARM_COMPUTE_RETURN_ON_ERROR(CLGEMMLowpOutputStage::validate(&col2im_output_info, nullptr, &col2im_output_info.clone()->set_is_resizable(true).set_data_type(input->data_type()), output_stage_info));
         ARM_COMPUTE_RETURN_ON_ERROR(CLSlice::validate(&col2im_output_info.clone()->set_is_resizable(true).set_data_type(input->data_type()), output, start_end.first, start_end.second));
     }
     else if(padded_input)
     {
         const auto start_end = compute_start_end_slice_coordinates(col2im_output_info, deconv_info, is_nchw);
         ARM_COMPUTE_RETURN_ON_ERROR(CLDeconvolutionReshapeOutputKernel::validate(&gemm_output_info, bias, &col2im_output_info, input, weights, deconv_info));
         ARM_COMPUTE_RETURN_ON_ERROR(CLSlice::validate(&col2im_output_info, output, start_end.first, start_end.second));
     }
     else if(is_quantized)
     {
         ARM_COMPUTE_RETURN_ON_ERROR(CLDeconvolutionReshapeOutputKernel::validate(&gemm_output_info, bias, &col2im_output_info, input, weights, deconv_info));
         ARM_COMPUTE_RETURN_ON_ERROR(CLGEMMLowpOutputStage::validate(&col2im_output_info, nullptr, output, output_stage_info));
     }
     else
     {
         ARM_COMPUTE_RETURN_ON_ERROR(CLDeconvolutionReshapeOutputKernel::validate(&gemm_output_info, bias, output, input, weights, deconv_info));
     }
 
     return Status{};
 }
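Two of the checks in validate() are easy to reproduce by hand: the requirement that the kernel size equals the stride, and the shape of the intermediate GEMM output. The sketch below uses logical dimensions rather than layout-dependent indices. `Dims`, `stride_matches_kernel` and `gemm_output_shape` are hypothetical names, not library API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Logical tensor dimensions. For weights, 'batches' holds the number of
// output feature maps (OFM), matching the use of idx_b in validate().
struct Dims
{
    std::size_t width;
    std::size_t height;
    std::size_t channels;
    std::size_t batches;
};

// Mirrors the two stride checks: the kernel width must equal stride_x and
// the kernel height must equal stride_y.
inline bool stride_matches_kernel(const Dims &weights, std::size_t stride_x, std::size_t stride_y)
{
    return weights.width == stride_x && weights.height == stride_y;
}

// GEMM output shape as built in validate():
//   [kernel_w * kernel_h * OFM, input_w, input_h, input_batches]
inline std::array<std::size_t, 4> gemm_output_shape(const Dims &input, const Dims &weights)
{
    return { weights.width * weights.height * weights.batches, input.width, input.height, input.batches };
}
```

With a 4x4 input of 16 channels and 2x2 weights producing 8 output feature maps, the GEMM output shape is `[32, 4, 4, 1]`. A plausible reading of the stride restriction is that, when the kernel size equals the stride, each input element contributes to its own non-overlapping output patch, so a reshape of the GEMM result is enough and no accumulation across patches is required.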

The documentation for this class was generated from the following files:

arm_compute/runtime/CL/functions/CLGEMMDeconvolutionLayer.h
src/runtime/CL/functions/CLGEMMDeconvolutionLayer.cpp
