Function to run the deconvolution layer.

#include <CLDirectDeconvolutionLayer.h>

Public Member Functions
	CLDirectDeconvolutionLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
	Constructor.

	CLDirectDeconvolutionLayer (const CLDirectDeconvolutionLayer &)=delete
	Prevent instances of this class from being copied (as this class contains pointers).

	CLDirectDeconvolutionLayer (CLDirectDeconvolutionLayer &&)=default
	Default move constructor.

CLDirectDeconvolutionLayer &	operator= (const CLDirectDeconvolutionLayer &)=delete
	Prevent instances of this class from being copied (as this class contains pointers).

CLDirectDeconvolutionLayer &	operator= (CLDirectDeconvolutionLayer &&)=default
	Default move assignment operator.

void	configure (ICLTensor *input, ICLTensor *weights, const ICLTensor *bias, ICLTensor *output, const PadStrideInfo &info, const WeightsInfo &weights_info=WeightsInfo())
	Set the input, weights, biases and output tensors.

void	configure (const CLCompileContext &compile_context, ICLTensor *input, ICLTensor *weights, const ICLTensor *bias, ICLTensor *output, const PadStrideInfo &info, const WeightsInfo &weights_info=WeightsInfo())
	Set the input, weights, biases and output tensors.

void	run () override
	Run the kernels contained in the function.

void	prepare () override
	Prepare the function for executing.

Public Member Functions inherited from IFunction
virtual	~IFunction ()=default
	Destructor.

Static Public Member Functions
static Status	validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *bias, ITensorInfo *output, const PadStrideInfo &info, const WeightsInfo &weights_info=WeightsInfo())
	Static function to check if given info will lead to a valid configuration of CLDirectDeconvolutionLayer.

Detailed Description

Function to run the deconvolution layer.

Deconvolution Layer is the backward pass of Convolution Layer. First the input is transformed according to the stride and pad info, and then a convolution with stride 1 is applied to the result. The input stride defines how many zeros are inserted between neighbouring input elements, and pad is the amount of padding.
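The zero-insertion step can be illustrated in one dimension. This is a minimal sketch, not the library's upsampling kernel: `insert_zeros` is a hypothetical helper, and it assumes the usual transposed-convolution convention of placing `stride - 1` zeros between neighbouring elements. Padding is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the library): spread the input samples
// `stride` positions apart and fill the gaps with zeros.
// Assumption: stride - 1 zeros are placed between neighbouring elements.
std::vector<float> insert_zeros(const std::vector<float> &in, std::size_t stride)
{
    assert(stride >= 1);
    if(in.empty())
    {
        return {};
    }
    // (n - 1) gaps of width `stride`, plus the last element itself.
    std::vector<float> out((in.size() - 1) * stride + 1, 0.0f);
    for(std::size_t i = 0; i < in.size(); ++i)
    {
        out[i * stride] = in[i];
    }
    return out;
}
```

With a stride of 1 the input is returned unchanged, which matches the intuition that a stride-1 deconvolution needs no upsampling.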

The relation between input to output is as follows:

\[ width\_output = (width\_input - 1) \cdot stride\_x - 2 \cdot padding\_x + kernel\_x \]

\[ height\_output = (height\_input - 1) \cdot stride\_y - 2 \cdot padding\_y + kernel\_y \]

where:
width_input is the size of the first input dimension.
height_input is the size of the second input dimension.
width_output is the size of the first output dimension.
height_output is the size of the second output dimension.
kernel_x and kernel_y are the convolution sizes in x and y.
stride_x and stride_y are the input strides of the first and second dimension.
padding_x and padding_y are the padding amounts in x and y.
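As a quick check of the relation above, the following sketch evaluates it for one dimension. `deconv_output_dim` is a hypothetical name used only for illustration; the library computes the output size through `deconvolution_output_dimensions()`, whose exact behaviour is not reproduced here.

```cpp
#include <cassert>

// Hypothetical helper: evaluates
//   output = (input - 1) * stride - 2 * padding + kernel
// for a single dimension, with symmetric padding assumed.
unsigned int deconv_output_dim(unsigned int input, unsigned int stride, unsigned int padding, unsigned int kernel)
{
    assert(input >= 1 && stride >= 1);
    const unsigned int unpadded = (input - 1) * stride + kernel;
    // Guard against unsigned underflow when the padding is too large.
    assert(unpadded >= 2 * padding);
    return unpadded - 2 * padding;
}
```

For example, an input width of 4 with stride 2, padding 1 and a 3-wide kernel gives (4 - 1) * 2 - 2 + 3 = 7.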

The weights used by Deconvolution are supposed to be the same as the ones used for Convolution. Therefore, it will be necessary to use the weights in the reverse order to perform an actual convolution. This is achieved by using CLReverse.
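To make the reversal concrete, here is a minimal sketch that flips a single 2D kernel along both spatial axes. It is not CLReverse: `flip_kernel_2d` is a hypothetical helper, and the row-major layout of the kernel is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: reverse a row-major width x height kernel along
// both the width and the height axis.
std::vector<float> flip_kernel_2d(const std::vector<float> &kernel, std::size_t width, std::size_t height)
{
    assert(kernel.size() == width * height);
    std::vector<float> flipped(kernel.size());
    for(std::size_t y = 0; y < height; ++y)
    {
        for(std::size_t x = 0; x < width; ++x)
        {
            // Element (x, y) moves to (width - 1 - x, height - 1 - y).
            flipped[(height - 1 - y) * width + (width - 1 - x)] = kernel[y * width + x];
        }
    }
    return flipped;
}
```

Applying the flip twice returns the original kernel, which is a convenient sanity check.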

This function calls the following OpenCL kernels/functions:

CLDeconvolutionLayerUpsample
CLConvolutionLayer

And the following CPP kernels:

CLReverse

Definition at line 75 of file CLDirectDeconvolutionLayer.h.

Constructor & Destructor Documentation

◆ CLDirectDeconvolutionLayer() [1/3]

CLDirectDeconvolutionLayer ( std::shared_ptr< IMemoryManager > memory_manager = nullptr )

Constructor.

Definition at line 45 of file CLDirectDeconvolutionLayer.cpp.

     : _memory_group(std::move(memory_manager)),
       _scale_f(),
       _conv_f(),
       _flip_weights(),
       _scaled_output(),
       _original_weights(nullptr),
       _weights_flipped(),
       _flip_axis(),
       _is_prepared(false)
 {
 }

◆ CLDirectDeconvolutionLayer() [2/3]

CLDirectDeconvolutionLayer ( const CLDirectDeconvolutionLayer & )

delete

Prevent instances of this class from being copied (as this class contains pointers).

◆ CLDirectDeconvolutionLayer() [3/3]

CLDirectDeconvolutionLayer ( CLDirectDeconvolutionLayer && )

default

Default move constructor.

Member Function Documentation

◆ configure() [1/2]

void configure	(	ICLTensor *	input,
		ICLTensor *	weights,
		const ICLTensor *	bias,
		ICLTensor *	output,
		const PadStrideInfo &	info,
		const WeightsInfo &	weights_info = `WeightsInfo()`
	)

Set the input, weights, biases and output tensors.

Parameters

[in,out]	input	Input tensor. 3 lower dimensions represent a single input, and an optional 4th dimension for batch of inputs. Data types supported: QASYMM8_SIGNED/QASYMM8/F16/F32.
[in]	weights	The 4d weights with dimensions [width, height, IFM, OFM]. Data type supported: Same as `input`.
[in]	bias	(Optional) The biases have one dimension. Data type supported: Should match `input` data type, except for input of QASYMM8 and QASYMM8_SIGNED type where biases should be of S32 type.
[out]	output	Output tensor. The output has the same number of dimensions as the `input`.
[in]	info	Contains padding and policies to be used in the deconvolution, as described in PadStrideInfo.
[in]	weights_info	(Optional) Weights information needed for CLConvolutionLayer, specifies if the weights tensor has been reshaped with CLWeightsReshapeKernel.

Definition at line 110 of file CLDirectDeconvolutionLayer.cpp.

References CLKernelLibrary::get().

 {
     configure(CLKernelLibrary::get().get_compile_context(), input, weights, bias, output, info, weights_info);
 }

◆ configure() [2/2]

void configure	(	const CLCompileContext &	compile_context,
		ICLTensor *	input,
		ICLTensor *	weights,
		const ICLTensor *	bias,
		ICLTensor *	output,
		const PadStrideInfo &	info,
		const WeightsInfo &	weights_info = `WeightsInfo()`
	)

Set the input, weights, biases and output tensors.

Parameters

[in]	compile_context	The compile context to be used.
[in,out]	input	Input tensor. 3 lower dimensions represent a single input, and an optional 4th dimension for batch of inputs. Data types supported: QASYMM8_SIGNED/QASYMM8/F16/F32.
[in]	weights	The 4d weights with dimensions [width, height, IFM, OFM]. Data type supported: Same as `input`.
[in]	bias	(Optional) The biases have one dimension. Data type supported: Should match `input` data type, except for input of QASYMM8 and QASYMM8_SIGNED type where biases should be of S32 type.
[out]	output	Output tensor. The output has the same number of dimensions as the `input`.
[in]	info	Contains padding and policies to be used in the deconvolution, as described in PadStrideInfo.
[in]	weights_info	(Optional) Weights information needed for CLConvolutionLayer, specifies if the weights tensor has been reshaped with CLWeightsReshapeKernel.

Definition at line 116 of file CLDirectDeconvolutionLayer.cpp.

References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_ON, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, arm_compute::auto_init_if_empty(), ICLTensor::buffer(), arm_compute::CEIL, ICloneable< T >::clone(), arm_compute::misc::shape_calculator::compute_deconvolution_output_shape(), arm_compute::misc::shape_calculator::compute_deconvolution_upsampled_shape(), CLReverse::configure(), CLDeconvolutionLayerUpsample::configure(), CLConvolutionLayer::configure(), arm_compute::test::validation::conv_info, arm_compute::test::validation::data_layout, ITensorInfo::data_layout(), ITensorInfo::data_type(), arm_compute::deconvolution_output_dimensions(), ITensorInfo::dimension(), arm_compute::FLOOR, arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, ITensor::info(), arm_compute::test::validation::info, ITensorAllocator::init(), MemoryGroup::manage(), CLTensor::map(), arm_compute::NHWC, arm_compute::test::validation::output_shape, PadStrideInfo::pad_bottom(), PadStrideInfo::pad_left(), PadStrideInfo::pad_right(), PadStrideInfo::pad_top(), ITensorInfo::quantization_info(), WeightsInfo::retain_internal_weights(), TensorInfo::set_data_layout(), PadStrideInfo::stride(), arm_compute::U, arm_compute::U32, CLTensor::unmap(), CLDirectDeconvolutionLayer::validate(), and arm_compute::WIDTH.

 {
     ARM_COMPUTE_ERROR_ON_NULLPTR(input, weights, output);
 
     const unsigned int pad_left   = info.pad_left();
     const unsigned int pad_right  = info.pad_right();
     const unsigned int pad_top    = info.pad_top();
     const unsigned int pad_bottom = info.pad_bottom();
     const unsigned int stride_x   = info.stride().first;
     const unsigned int stride_y   = info.stride().second;
 
     const DataLayout data_layout = input->info()->data_layout();
 
     const size_t idx_w = get_data_layout_dimension_index(data_layout, DataLayoutDimension::WIDTH);
     const size_t idx_h = get_data_layout_dimension_index(data_layout, DataLayoutDimension::HEIGHT);
 
     _original_weights = weights;
     _flip_axis.allocator()->init(TensorInfo(TensorShape(2U), 1, DataType::U32));
     _weights_flipped.allocator()->init(weights->info()->clone()->set_data_layout(data_layout));
     _flip_weights.configure(compile_context, weights, &_weights_flipped, &_flip_axis);
 
     auto out_dims = deconvolution_output_dimensions(input->info()->dimension(idx_w), input->info()->dimension(idx_h), weights->info()->dimension(idx_w), weights->info()->dimension(idx_h), info);
 
     const TensorShape output_shape = compute_deconvolution_output_shape(out_dims, *input->info(), *weights->info());
 
     // Output auto initialization if not yet initialized
     auto_init_if_empty(*output->info(), input->info()->clone()->set_tensor_shape(output_shape).set_data_layout(data_layout));
 
     // Perform validation step
     ARM_COMPUTE_ERROR_THROW_ON(CLDirectDeconvolutionLayer::validate(input->info(), weights->info(), bias == nullptr ? nullptr : bias->info(), output->info(), info));
 
     _is_prepared = weights_info.retain_internal_weights();
 
     _memory_group.manage(&_scaled_output);
 
     // Find the upsampled dimensions and the padding needed for the convolution with stride 1 in order to match output shape
     unsigned int      deconv_pad_x    = 0;
     unsigned int      deconv_pad_y    = 0;
     const TensorShape scale_out_shape = compute_deconvolution_upsampled_shape(*input->info(), *weights->info(), stride_x, stride_y, out_dims, deconv_pad_x, deconv_pad_y);
 
     unsigned int deconv_pad_left  = pad_right > pad_left ? pad_right - pad_left : 0;
     unsigned int deconv_pad_right = pad_left > pad_right ? pad_left - pad_right : 0;
     deconv_pad_x -= deconv_pad_left + deconv_pad_right;
     ARM_COMPUTE_ERROR_ON((deconv_pad_x % 2) != 0);
     deconv_pad_left += deconv_pad_x / 2;
     deconv_pad_right += deconv_pad_x / 2;
 
     unsigned int deconv_pad_top    = pad_bottom > pad_top ? pad_bottom - pad_top : 0;
     unsigned int deconv_pad_bottom = pad_top > pad_bottom ? pad_top - pad_bottom : 0;
     deconv_pad_y -= deconv_pad_top + deconv_pad_bottom;
     ARM_COMPUTE_ERROR_ON((deconv_pad_y % 2) != 0);
     deconv_pad_top += deconv_pad_y / 2;
     deconv_pad_bottom += deconv_pad_y / 2;
 
     TensorInfo scale_out_info(scale_out_shape, 1, input->info()->data_type(), input->info()->quantization_info());
     scale_out_info.set_data_layout(data_layout);
     _scaled_output.allocator()->init(scale_out_info);
 
     // configure scale function
     const PadStrideInfo upsample_info(stride_x, stride_y, deconv_pad_left, deconv_pad_right, deconv_pad_top, deconv_pad_bottom, DimensionRoundingType::FLOOR);
     _scale_f.configure(compile_context, input, &_scaled_output, upsample_info);
 
     // Setup the function to convolve the upscaled output
     const PadStrideInfo conv_info(1, 1, 0, 0, 0, 0, DimensionRoundingType::CEIL);
     _conv_f.configure(compile_context, &_scaled_output, &_weights_flipped, bias, output, conv_info, weights_info);
     _scaled_output.allocator()->allocate();
 
     // Setup flip axis data
     _flip_axis.allocator()->allocate();
     _flip_axis.map(true);
     auto axis_data = reinterpret_cast<uint32_t *>(_flip_axis.buffer());
     if(weights->info()->data_layout() == DataLayout::NHWC)
     {
         axis_data[0] = 1;
         axis_data[1] = 2;
     }
     else
     {
         axis_data[0] = 0;
         axis_data[1] = 1;
     }
     _flip_axis.unmap();
 }
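The padding split performed for each axis in the listing above can be isolated into a small function. This is a sketch under stated assumptions: `split_deconv_pad` is a hypothetical name, `total` stands for the value that `compute_deconvolution_upsampled_shape()` writes into `deconv_pad_x` or `deconv_pad_y`, and the first assertion is an extra underflow guard that the listing does not contain.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper mirroring the per-axis logic of configure():
// returns {pad_before, pad_after} for the upsampling step.
// `pad_before`/`pad_after` are the user's left/right (or top/bottom) pads.
std::pair<unsigned int, unsigned int> split_deconv_pad(unsigned int total, unsigned int pad_before, unsigned int pad_after)
{
    // Compensate for asymmetric user padding on the opposite side.
    unsigned int before = pad_after > pad_before ? pad_after - pad_before : 0;
    unsigned int after  = pad_before > pad_after ? pad_before - pad_after : 0;

    // Extra guard (not in the listing) against unsigned underflow.
    assert(total >= before + after);
    total -= before + after;

    // The remainder must divide evenly between both sides.
    assert(total % 2 == 0);
    before += total / 2;
    after += total / 2;
    return { before, after };
}
```

With symmetric user padding the total is simply halved; with asymmetric padding the difference is first assigned to the opposite side and the rest is shared equally.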

◆ operator=() [1/2]

CLDirectDeconvolutionLayer& operator= ( const CLDirectDeconvolutionLayer & )

delete

Prevent instances of this class from being copied (as this class contains pointers).

◆ operator=() [2/2]

CLDirectDeconvolutionLayer& operator= ( CLDirectDeconvolutionLayer && )

default

Default move assignment operator.

◆ prepare()

void prepare ( )

overridevirtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note: Prepare stage might not need all the function's buffers' backing memory to be available in order to execute

Reimplemented from IFunction.

Definition at line 211 of file CLDirectDeconvolutionLayer.cpp.

References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_ON, CLTensorAllocator::free(), ITensor::is_used(), ITensor::mark_as_unused(), CLConvolutionLayer::prepare(), and ICLSimpleFunction::run().

Referenced by CLDirectDeconvolutionLayer::run().

 {
     if(!_is_prepared)
     {
         ARM_COMPUTE_ERROR_ON(!_original_weights->is_used());
 
         // Run weights flipping and mark original weights tensor as unused
         _weights_flipped.allocator()->allocate();
         _flip_weights.run();
         _original_weights->mark_as_unused();
 
         // Prepare convolution
         _conv_f.prepare();
 
         // Free flipped weights
         if(!_weights_flipped.is_used())
         {
             _weights_flipped.allocator()->free();
         }
 
         _is_prepared = true;
     }
 }

◆ run()

void run ( )

overridevirtual

Run the kernels contained in the function.

For Neon kernels:

Multi-threading is used for the kernels which are parallelisable.
By default std::thread::hardware_concurrency() threads are used.

Note: CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

All the kernels are enqueued on the queue associated with CLScheduler.
The queue is then flushed.

Note: The function will not block until the kernels are executed; it is the user's responsibility to wait. prepare() is called on the first run if it has not been done already.

Implements IFunction.

Definition at line 201 of file CLDirectDeconvolutionLayer.cpp.

References CLDirectDeconvolutionLayer::prepare(), CLDeconvolutionLayerUpsample::run(), and CLConvolutionLayer::run().

 {
     prepare();
 
     MemoryGroupResourceScope scope_mg(_memory_group);
 
     _scale_f.run();
     _conv_f.run();
 }
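The interplay between run() and prepare() follows a prepare-once pattern: the one-off work is guarded by a flag, so calling run() repeatedly triggers it only the first time. The class below is a hypothetical stand-in that keeps only this control flow; the counters replace the real weight flipping and kernel execution.

```cpp
#include <cassert>

// Hypothetical stand-in for a function object with a one-off prepare stage.
class PreparedOnceFunction
{
public:
    void prepare()
    {
        if(!_is_prepared)
        {
            ++_prepare_count; // stands in for flipping weights, preparing the convolution
            _is_prepared = true;
        }
    }

    void run()
    {
        prepare();            // no-op after the first call
        assert(_is_prepared); // invariant: work below may rely on prepared state
        ++_run_count;         // stands in for enqueueing the kernels
    }

    int prepare_count() const { return _prepare_count; }
    int run_count() const { return _run_count; }

private:
    bool _is_prepared{ false };
    int  _prepare_count{ 0 };
    int  _run_count{ 0 };
};
```

Calling prepare() explicitly before the first run() is also valid and moves the one-off cost out of the first inference.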

◆ validate()

Status validate	(	const ITensorInfo *	input,
		const ITensorInfo *	weights,
		const ITensorInfo *	bias,
		ITensorInfo *	output,
		const PadStrideInfo &	info,
		const WeightsInfo &	weights_info = `WeightsInfo()`
	)

static

Static function to check if given info will lead to a valid configuration of CLDirectDeconvolutionLayer.

Parameters

[in]	input	Input tensor info. 3 lower dimensions represent a single input, and an optional 4th dimension for batch of inputs. Data types supported: QASYMM8_SIGNED/QASYMM8/F16/F32.
[in]	weights	The 4d weights info with dimensions [width, height, IFM, OFM]. Data type supported: Same as `input`.
[in]	bias	(Optional) The biases have one dimension. Data type supported: Should match `input` data type, except for input of QASYMM8 and QASYMM8_SIGNED type where biases should be of S32 type.
[in]	output	Output tensor info. The output has the same number of dimensions as the `input`.
[in]	info	Contains padding and policies to be used in the deconvolution, as described in PadStrideInfo.
[in]	weights_info	(Optional) Weights information needed for CLConvolutionLayer, specifies if the weights tensor has been reshaped with CLWeightsReshapeKernel.

Returns: a status

Definition at line 58 of file CLDirectDeconvolutionLayer.cpp.

References ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_LAYOUT, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_MSG, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, arm_compute::CEIL, arm_compute::CHANNEL, ICloneable< T >::clone(), arm_compute::misc::shape_calculator::compute_deconvolution_output_shape(), arm_compute::misc::shape_calculator::compute_deconvolution_upsampled_shape(), arm_compute::test::validation::conv_info, arm_compute::test::validation::data_layout, ITensorInfo::data_layout(), ITensorInfo::data_type(), arm_compute::deconvolution_output_dimensions(), ITensorInfo::dimension(), arm_compute::F16, arm_compute::F32, arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, arm_compute::test::validation::info, arm_compute::is_data_type_quantized_asymmetric(), arm_compute::test::validation::output_shape, arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, arm_compute::S32, PadStrideInfo::stride(), CLDeconvolutionLayerUpsample::validate(), CLConvolutionLayer::validate(), and arm_compute::WIDTH.

Referenced by CLDirectDeconvolutionLayer::configure(), and CLDeconvolutionLayer::validate().

 {
     ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(input, weights, output);
     ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(input, 1, DataType::QASYMM8_SIGNED, DataType::QASYMM8, DataType::F16, DataType::F32);
     ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_LAYOUT(input, weights);
     const DataLayout data_layout = input->data_layout();
 
     const size_t idx_w = get_data_layout_dimension_index(data_layout, DataLayoutDimension::WIDTH);
     const size_t idx_h = get_data_layout_dimension_index(data_layout, DataLayoutDimension::HEIGHT);
     const size_t idx_c = get_data_layout_dimension_index(data_layout, DataLayoutDimension::CHANNEL);
 
     ARM_COMPUTE_RETURN_ERROR_ON(weights->dimension(idx_w) != weights->dimension(idx_h));
     ARM_COMPUTE_RETURN_ERROR_ON(weights->dimension(idx_w) < 1);
 
     auto out_dims = deconvolution_output_dimensions(input->dimension(idx_w), input->dimension(idx_h), weights->dimension(idx_w), weights->dimension(idx_h), info);
 
     const TensorShape output_shape = compute_deconvolution_output_shape(out_dims, *input, *weights);
 
     ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(input, output, weights);
 
     if(bias != nullptr)
     {
         if(is_data_type_quantized_asymmetric(input->data_type()))
         {
             ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(bias, 1, DataType::S32);
         }
         else
         {
             ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(input, bias);
         }
         ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_LAYOUT(input, bias);
     }
 
     ARM_COMPUTE_RETURN_ERROR_ON_MSG(output->dimension(idx_w) != output_shape[idx_w], "Output's width is invalid.");
     ARM_COMPUTE_RETURN_ERROR_ON_MSG(output->dimension(idx_h) != output_shape[idx_h], "Output's height is invalid.");
     ARM_COMPUTE_RETURN_ERROR_ON_MSG(output->dimension(idx_c) != output_shape[idx_c], "Output's depth is invalid.");
 
     unsigned int        deconv_pad_x    = 0;
     unsigned int        deconv_pad_y    = 0;
     const unsigned int  stride_x        = info.stride().first;
     const unsigned int  stride_y        = info.stride().second;
     const TensorShape   scale_out_shape = compute_deconvolution_upsampled_shape(*input, *weights, stride_x, stride_y, out_dims, deconv_pad_x, deconv_pad_y);
     TensorInfo          scale_out_info(input->clone()->set_is_resizable(true).reset_padding().set_tensor_shape(scale_out_shape).set_data_layout(data_layout));
     const PadStrideInfo conv_info(1, 1, 0, 0, 0, 0, DimensionRoundingType::CEIL);
 
     ARM_COMPUTE_RETURN_ON_ERROR(CLDeconvolutionLayerUpsample::validate(input, &scale_out_info, info));
     ARM_COMPUTE_RETURN_ON_ERROR(CLConvolutionLayer::validate(&scale_out_info, weights, bias, output, conv_info, weights_info));
 
     return Status{};
 }
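The bias data type rule checked in the listing above can be summarised in a few lines. This sketch uses a hypothetical `ElemType` enum and helper names rather than the library's `DataType` and `is_data_type_quantized_asymmetric()`, so it only mirrors the decision, not the actual validation macros.

```cpp
#include <cassert>

// Hypothetical stand-in for the data types relevant to this function.
enum class ElemType
{
    QASYMM8,
    QASYMM8_SIGNED,
    F16,
    F32,
    S32
};

// True for the two asymmetric quantized input types named in the docs.
bool is_quantized_asymmetric(ElemType type)
{
    return type == ElemType::QASYMM8 || type == ElemType::QASYMM8_SIGNED;
}

// Bias type expected for a given input type: S32 for quantized inputs,
// otherwise the same type as the input.
ElemType expected_bias_type(ElemType input)
{
    assert(input != ElemType::S32); // S32 is not a supported input type
    return is_quantized_asymmetric(input) ? ElemType::S32 : input;
}
```

A caller could compare `expected_bias_type(input_type)` with the actual bias type before invoking validate() to produce a friendlier error message.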

The documentation for this class was generated from the following files:

arm_compute/runtime/CL/functions/CLDirectDeconvolutionLayer.h
src/runtime/CL/functions/CLDirectDeconvolutionLayer.cpp
