Compute Library
 19.08
CLDirectDeconvolutionLayer Class Reference

Function to run the deconvolution layer.

#include <CLDirectDeconvolutionLayer.h>


Public Member Functions

 CLDirectDeconvolutionLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Constructor.
 
 CLDirectDeconvolutionLayer (const CLDirectDeconvolutionLayer &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 
 CLDirectDeconvolutionLayer (CLDirectDeconvolutionLayer &&)=default
 Default move constructor.
 
CLDirectDeconvolutionLayer & operator= (const CLDirectDeconvolutionLayer &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 
CLDirectDeconvolutionLayer & operator= (CLDirectDeconvolutionLayer &&)=default
 Default move assignment operator.
 
void configure (ICLTensor *input, ICLTensor *weights, const ICLTensor *bias, ICLTensor *output, const PadStrideInfo &info, const WeightsInfo &weights_info=WeightsInfo())
 Set the input, weights, biases and output tensors.
 
void run () override
 Run the kernels contained in the function.
 
void prepare () override
 Prepare the function for executing.
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor.
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *bias, ITensorInfo *output, const PadStrideInfo &info, const WeightsInfo &weights_info=WeightsInfo())
 Static function to check if given info will lead to a valid configuration of CLDirectDeconvolutionLayer.
 

Detailed Description

Function to run the deconvolution layer.

The deconvolution layer is the backward pass of the convolution layer. The input is first upsampled according to the stride and padding information, and a convolution with stride 1 is then run over the upsampled result. The stride determines how many zeros are inserted between adjacent input elements, and pad is the amount of padding.

The relation between input to output is as follows:

\[ width\_output = (width\_input - 1) \cdot stride\_x - 2 \cdot padding\_x + kernel\_x \]

\[ height\_output = (height\_input - 1) \cdot stride\_y - 2 \cdot padding\_y + kernel\_y \]

where:

  • width_input is the size of the first input dimension.
  • height_input is the size of the second input dimension.
  • width_output is the size of the first output dimension.
  • height_output is the size of the second output dimension.
  • kernel_x and kernel_y are the convolution sizes in x and y.
  • stride_x and stride_y are the input strides of the first and second dimension.
  • padding_x and padding_y are the amounts of padding in x and y.
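The two formulas can be checked with a few lines of C++. This is only a sketch of the arithmetic: `deconv_output_dims` is a hypothetical helper written for this example, not the library's `deconvolution_output_dimensions()`, and the asserts are an assumption about how to guard the unsigned arithmetic.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper that mirrors the two output-size formulas.
// It is not the library's own implementation.
std::pair<unsigned int, unsigned int> deconv_output_dims(unsigned int in_w, unsigned int in_h,
                                                         unsigned int kernel_w, unsigned int kernel_h,
                                                         unsigned int pad_x, unsigned int pad_y,
                                                         unsigned int stride_x, unsigned int stride_y)
{
    // The input must be non-empty, otherwise (in - 1) underflows.
    assert(in_w >= 1 && in_h >= 1);

    // Size before the padding is removed: (in - 1) * stride + kernel
    const unsigned int full_w = (in_w - 1) * stride_x + kernel_w;
    const unsigned int full_h = (in_h - 1) * stride_y + kernel_h;

    // The padding must leave at least one output element.
    assert(full_w > 2 * pad_x && full_h > 2 * pad_y);

    return { full_w - 2 * pad_x, full_h - 2 * pad_y };
}
```

For a 4x4 input, a 3x3 kernel, stride 2 and padding 1, this gives (4 - 1) * 2 - 2 * 1 + 3 = 7 in both dimensions.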

The weights used by the deconvolution are expected to be the same as the ones used for the convolution. To run the stride-1 convolution on the upsampled input, the weights therefore have to be used in reverse order, that is, flipped along their width and height. This is done with CLReverse.
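A one-dimensional sketch may help to see why upsampling, a stride-1 convolution and flipped weights together produce a deconvolution. Both functions below are hypothetical illustrations with zero padding, not code from the library. The first scatters each input element into the output, the second inserts zeros, adds a border of kernel size minus one, and slides the flipped kernel with stride 1 (without flipping it again, as convolution layers usually do).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference: scatter every input element into the output, scaled by the kernel.
std::vector<float> deconv1d_reference(const std::vector<float> &in, const std::vector<float> &k, std::size_t stride)
{
    assert(!in.empty() && !k.empty() && stride >= 1);
    std::vector<float> out((in.size() - 1) * stride + k.size(), 0.f);
    for(std::size_t i = 0; i < in.size(); ++i)
    {
        for(std::size_t j = 0; j < k.size(); ++j)
        {
            out[i * stride + j] += in[i] * k[j];
        }
    }
    return out;
}

// Same result via zero insertion followed by a stride-1 pass with the flipped kernel.
std::vector<float> deconv1d_upsample_conv(const std::vector<float> &in, const std::vector<float> &k, std::size_t stride)
{
    assert(!in.empty() && !k.empty() && stride >= 1);

    // Upsample: place the inputs "stride" apart and add a zero border of k.size() - 1.
    const std::size_t border = k.size() - 1;
    std::vector<float> up((in.size() - 1) * stride + 1 + 2 * border, 0.f);
    for(std::size_t i = 0; i < in.size(); ++i)
    {
        up[border + i * stride] = in[i];
    }

    // Flip the weights once.
    const std::vector<float> flipped(k.rbegin(), k.rend());

    // Stride-1 sliding window over the upsampled signal.
    std::vector<float> out(up.size() - k.size() + 1, 0.f);
    for(std::size_t o = 0; o < out.size(); ++o)
    {
        for(std::size_t j = 0; j < k.size(); ++j)
        {
            out[o] += up[o + j] * flipped[j];
        }
    }
    return out;
}
```

With input {1, 2}, kernel {1, 2, 3} and stride 2, both functions return {1, 2, 5, 4, 6}, whose length matches (2 - 1) * 2 + 3 = 5.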

This function calls the following OpenCL kernels/functions:

  1. CLDeconvolutionLayerUpsample
  2. CLConvolutionLayer
  3. CLReverse

Definition at line 75 of file CLDirectDeconvolutionLayer.h.

Constructor & Destructor Documentation

◆ CLDirectDeconvolutionLayer() [1/3]

CLDirectDeconvolutionLayer ( std::shared_ptr< IMemoryManager > memory_manager = nullptr )

Constructor.

Definition at line 39 of file CLDirectDeconvolutionLayer.cpp.

CLDirectDeconvolutionLayer::CLDirectDeconvolutionLayer(std::shared_ptr<IMemoryManager> memory_manager)
    : _memory_group(std::move(memory_manager)),
      _scale_f(),
      _conv_f(),
      _flip_weights(),
      _scaled_output(),
      _original_weights(nullptr),
      _weights_flipped(),
      _flip_axis(),
      _is_prepared(false)
{
}

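◆ CLDirectDeconvolutionLayer() [2/3]

CLDirectDeconvolutionLayer ( const CLDirectDeconvolutionLayer & )
delete

Prevent instances of this class from being copied (as this class contains pointers).

◆ CLDirectDeconvolutionLayer() [3/3]

CLDirectDeconvolutionLayer ( CLDirectDeconvolutionLayer && )
default

Default move constructor.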

Member Function Documentation

◆ configure()

void configure ( ICLTensor * input,
ICLTensor * weights,
const ICLTensor * bias,
ICLTensor * output,
const PadStrideInfo & info,
const WeightsInfo & weights_info = WeightsInfo() 
)

Set the input, weights, biases and output tensors.

Parameters
[in,out]  input         Input tensor. The 3 lower dimensions represent a single input, with an optional 4th dimension for a batch of inputs. Data types supported: QASYMM8/F16/F32.
[in]      weights       The 4D weights with dimensions [width, height, IFM, OFM]. Data type supported: same as input.
[in]      bias          (Optional) The biases have one dimension. Data type supported: same as input.
[out]     output        Output tensor. The output has the same number of dimensions as the input.
[in]      info          Contains the padding and policies to be used in the deconvolution, as described in PadStrideInfo.
[in]      weights_info  (Optional) Weights information needed for CLConvolutionLayer. Specifies whether the weights tensor has been reshaped with CLWeightsReshapeKernel.

Definition at line 107 of file CLDirectDeconvolutionLayer.cpp.

void CLDirectDeconvolutionLayer::configure(ICLTensor *input, ICLTensor *weights, const ICLTensor *bias, ICLTensor *output, const PadStrideInfo &info, const WeightsInfo &weights_info)
{
    ARM_COMPUTE_ERROR_ON_NULLPTR(input, weights, output);

    const unsigned int stride_x = info.stride().first;
    const unsigned int stride_y = info.stride().second;

    const DataLayout data_layout = input->info()->data_layout();

    // Indices of the width and height dimensions for the current data layout
    const size_t idx_w = get_data_layout_dimension_index(data_layout, DataLayoutDimension::WIDTH);
    const size_t idx_h = get_data_layout_dimension_index(data_layout, DataLayoutDimension::HEIGHT);

    _original_weights = weights;
    _flip_axis.allocator()->init(TensorInfo(TensorShape(2U), 1, DataType::U32));
    _weights_flipped.allocator()->init(weights->info()->clone()->set_data_layout(data_layout));
    _flip_weights.configure(weights, &_weights_flipped, &_flip_axis);

    auto out_dims = deconvolution_output_dimensions(input->info()->dimension(idx_w), input->info()->dimension(idx_h), weights->info()->dimension(idx_w), weights->info()->dimension(idx_h),
                                                    info.pad().first, info.pad().second, stride_x, stride_y);

    const TensorShape output_shape = compute_deconvolution_output_shape(out_dims, *input->info(), *weights->info());

    // Output auto initialization if not yet initialized
    auto_init_if_empty(*output->info(), input->info()->clone()->set_tensor_shape(output_shape).set_data_layout(data_layout));

    // Perform validation step
    ARM_COMPUTE_ERROR_THROW_ON(CLDirectDeconvolutionLayer::validate(input->info(), weights->info(), bias == nullptr ? nullptr : bias->info(), output->info(), info));

    _is_prepared = weights_info.retain_internal_weights();

    _memory_group.manage(&_scaled_output);

    // Find the upsampled dimensions and the padding needed for the convolution with stride 1 in order to match output shape
    unsigned int      padx            = 0;
    unsigned int      pady            = 0;
    const TensorShape scale_out_shape = compute_deconvolution_upsampled_shape(*input->info(), *weights->info(), stride_x, stride_y, out_dims, padx, pady);

    TensorInfo scale_out_info(scale_out_shape, 1, input->info()->data_type(), input->info()->quantization_info());
    scale_out_info.set_data_layout(data_layout);
    _scaled_output.allocator()->init(scale_out_info);

    // Configure scale function
    const PadStrideInfo upsample_info(stride_x, stride_y, padx / 2, pady / 2);
    _scale_f.configure(input, &_scaled_output, upsample_info);

    // Setup the function to convolve the upscaled output
    const PadStrideInfo conv_info(1, 1, 0, 0, 0, 0, DimensionRoundingType::CEIL);
    _conv_f.configure(&_scaled_output, &_weights_flipped, bias, output, conv_info, weights_info);
    _scaled_output.allocator()->allocate();

    // Setup flip axis data: flip along the width and height dimensions of the weights
    _flip_axis.allocator()->allocate();
    _flip_axis.map(true);
    auto axis_data = reinterpret_cast<uint32_t *>(_flip_axis.buffer());
    if(weights->info()->data_layout() == DataLayout::NHWC)
    {
        axis_data[0] = 1;
        axis_data[1] = 2;
    }
    else
    {
        axis_data[0] = 0;
        axis_data[1] = 1;
    }
    _flip_axis.unmap();
}

References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, arm_compute::auto_init_if_empty(), arm_compute::test::validation::bias, ICLTensor::buffer(), arm_compute::CEIL, ICloneable< T >::clone(), TensorInfo::clone(), arm_compute::misc::shape_calculator::compute_deconvolution_output_shape(), arm_compute::misc::shape_calculator::compute_deconvolution_upsampled_shape(), CLReverse::configure(), CLDeconvolutionLayerUpsample::configure(), CLConvolutionLayer::configure(), arm_compute::test::validation::conv_info, arm_compute::test::validation::data_layout, ITensorInfo::data_layout(), TensorInfo::data_layout(), ITensorInfo::data_type(), arm_compute::deconvolution_output_dimensions(), ITensorInfo::dimension(), TensorInfo::dimension(), arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, ITensor::info(), CLTensor::info(), arm_compute::test::validation::info, ITensorAllocator::init(), MemoryGroupBase< TensorType >::manage(), CLTensor::map(), arm_compute::NHWC, arm_compute::test::validation::output_shape, ITensorInfo::quantization_info(), TensorInfo::set_data_layout(), arm_compute::U, arm_compute::U32, CLTensor::unmap(), CLDirectDeconvolutionLayer::validate(), arm_compute::test::validation::weights, arm_compute::test::validation::weights_info, and arm_compute::WIDTH.
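The flip axes written at the end of configure() are the indices of the width and height dimensions for the layout in use. The sketch below models that lookup with stand-in types. `Layout`, `Dim`, `dimension_index` and `flip_axes` are hypothetical names for this example, and the dimension orders (NCHW stored as [W, H, C, N], NHWC stored as [C, W, H, N]) are an assumption about how shapes are indexed.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Stand-ins for the library's DataLayout and DataLayoutDimension enums (hypothetical).
enum class Layout { NCHW, NHWC };
enum class Dim { WIDTH, HEIGHT, CHANNEL };

// Index of a dimension inside the tensor shape, innermost dimension first.
// Assumed orders: NCHW -> [W, H, C, N], NHWC -> [C, W, H, N].
std::size_t dimension_index(Layout layout, Dim dim)
{
    if(layout == Layout::NHWC)
    {
        switch(dim)
        {
            case Dim::CHANNEL: return 0;
            case Dim::WIDTH:   return 1;
            case Dim::HEIGHT:  return 2;
        }
    }
    else
    {
        switch(dim)
        {
            case Dim::WIDTH:   return 0;
            case Dim::HEIGHT:  return 1;
            case Dim::CHANNEL: return 2;
        }
    }
    return 0; // Not reached for valid enum values
}

// The two axes along which the weights are reversed: width and height.
std::array<std::uint32_t, 2> flip_axes(Layout layout)
{
    return { static_cast<std::uint32_t>(dimension_index(layout, Dim::WIDTH)),
             static_cast<std::uint32_t>(dimension_index(layout, Dim::HEIGHT)) };
}
```

Under these assumptions the NHWC case yields axes 1 and 2, and the NCHW case yields axes 0 and 1, which matches the values stored in the flip axis tensor in the listing.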

◆ operator=() [1/2]

CLDirectDeconvolutionLayer& operator= ( const CLDirectDeconvolutionLayer & )
delete

Prevent instances of this class from being copied (as this class contains pointers).

◆ operator=() [2/2]

CLDirectDeconvolutionLayer& operator= ( CLDirectDeconvolutionLayer && )
default

Default move assignment operator.

◆ prepare()

void prepare ( )
override virtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note
The prepare stage might not need all of the function's buffers' backing memory to be available in order to execute.

Reimplemented from IFunction.

Definition at line 185 of file CLDirectDeconvolutionLayer.cpp.

void CLDirectDeconvolutionLayer::prepare()
{
    if(!_is_prepared)
    {
        ARM_COMPUTE_ERROR_ON(!_original_weights->is_used());

        // Run weights flipping and mark original weights tensor as unused
        _weights_flipped.allocator()->allocate();
        _flip_weights.run();
        _original_weights->mark_as_unused();

        // Prepare convolution
        _conv_f.prepare();

        // Free flipped weights
        if(!_weights_flipped.is_used())
        {
            _weights_flipped.allocator()->free();
        }

        _is_prepared = true;
    }
}

References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_ON, CLTensorAllocator::free(), ITensor::is_used(), ITensor::mark_as_unused(), CLConvolutionLayer::prepare(), and ICLSimpleFunction::run().

Referenced by CLDirectDeconvolutionLayer::run().
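The prepare-once behaviour can be illustrated without any OpenCL code. The class below is a hypothetical stand-in, not part of the library: it flips its weights the first time prepare() runs, releases the originals, and lets run() call prepare() on every invocation so that the one-off work happens exactly once.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical model of a function object with a one-off prepare stage.
class PrepareOnceSketch
{
public:
    explicit PrepareOnceSketch(std::vector<float> weights)
        : _original_weights(std::move(weights))
    {
    }

    void prepare()
    {
        if(!_is_prepared)
        {
            // One-off step: flip the weights, then release the originals.
            _weights_flipped.assign(_original_weights.rbegin(), _original_weights.rend());
            _original_weights.clear();
            ++_prepare_count;
            _is_prepared = true;
        }
    }

    void run()
    {
        // Cheap after the first call because of the _is_prepared flag.
        prepare();
        assert(_is_prepared);
        ++_run_count;
    }

    int prepare_count() const { return _prepare_count; }
    int run_count() const { return _run_count; }
    const std::vector<float> &flipped_weights() const { return _weights_flipped; }

private:
    std::vector<float> _original_weights;
    std::vector<float> _weights_flipped{};
    bool               _is_prepared{ false };
    int                _prepare_count{ 0 };
    int                _run_count{ 0 };
};
```

Calling run() several times on one object performs the flip only on the first call. Calling prepare() explicitly beforehand moves that cost out of the first run.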

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For NEON kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
prepare() is called on the first run if it has not been called already.

Implements IFunction.

Definition at line 175 of file CLDirectDeconvolutionLayer.cpp.

void CLDirectDeconvolutionLayer::run()
{
    prepare();

    MemoryGroupResourceScope scope_mg(_memory_group);

    _scale_f.run();
    _conv_f.run();
}

References CLDirectDeconvolutionLayer::prepare(), CLDeconvolutionLayerUpsample::run(), and CLConvolutionLayer::run().

◆ validate()

Status validate ( const ITensorInfo * input,
const ITensorInfo * weights,
const ITensorInfo * bias,
ITensorInfo * output,
const PadStrideInfo & info,
const WeightsInfo & weights_info = WeightsInfo() 
)
static

Static function to check if given info will lead to a valid configuration of CLDirectDeconvolutionLayer.

Parameters
[in]  input         Input tensor info. The 3 lower dimensions represent a single input, with an optional 4th dimension for a batch of inputs. Data types supported: QASYMM8/F16/F32.
[in]  weights       The 4D weights info with dimensions [width, height, IFM, OFM]. Data type supported: same as input.
[in]  bias          (Optional) The biases have one dimension. Data type supported: same as input.
[in]  output        Output tensor info. The output has the same number of dimensions as the input.
[in]  info          Contains the padding and policies to be used in the deconvolution, as described in PadStrideInfo.
[in]  weights_info  (Optional) Weights information needed for CLConvolutionLayer. Specifies whether the weights tensor has been reshaped with CLWeightsReshapeKernel.
Returns
a status

Definition at line 52 of file CLDirectDeconvolutionLayer.cpp.

Status CLDirectDeconvolutionLayer::validate(const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *bias, ITensorInfo *output, const PadStrideInfo &info, const WeightsInfo &weights_info)
{
    // ...
    const DataLayout data_layout = input->data_layout();

    // Indices of the width, height and channel dimensions for the current data layout
    const size_t idx_w = get_data_layout_dimension_index(data_layout, DataLayoutDimension::WIDTH);
    const size_t idx_h = get_data_layout_dimension_index(data_layout, DataLayoutDimension::HEIGHT);
    const size_t idx_c = get_data_layout_dimension_index(data_layout, DataLayoutDimension::CHANNEL);

    ARM_COMPUTE_RETURN_ERROR_ON(weights->dimension(idx_w) != weights->dimension(idx_h));
    ARM_COMPUTE_RETURN_ERROR_ON(weights->dimension(idx_w) < 1);
    ARM_COMPUTE_RETURN_ERROR_ON(!info.padding_is_symmetric());

    const unsigned int stride_x = info.stride().first;
    const unsigned int stride_y = info.stride().second;

    auto out_dims = deconvolution_output_dimensions(input->dimension(idx_w), input->dimension(idx_h), weights->dimension(idx_w), weights->dimension(idx_h),
                                                    info.pad().first, info.pad().second, stride_x, stride_y);

    const TensorShape output_shape = compute_deconvolution_output_shape(out_dims, *input, *weights);

    // ...

    if(bias != nullptr)
    {
        if(is_data_type_quantized_asymmetric(input->data_type()))
        {
            // ...
        }
        else
        {
            // ...
        }
        // ...
    }

    ARM_COMPUTE_RETURN_ERROR_ON_MSG(output->dimension(idx_w) != output_shape[idx_w], "Output's width is invalid.");
    ARM_COMPUTE_RETURN_ERROR_ON_MSG(output->dimension(idx_h) != output_shape[idx_h], "Output's height is invalid.");
    ARM_COMPUTE_RETURN_ERROR_ON_MSG(output->dimension(idx_c) != output_shape[idx_c], "Output's depth is invalid.");

    unsigned int        padx            = 0;
    unsigned int        pady            = 0;
    const TensorShape   scale_out_shape = compute_deconvolution_upsampled_shape(*input, *weights, stride_x, stride_y, out_dims, padx, pady);
    TensorInfo          scale_out_info(input->clone()->set_is_resizable(true).reset_padding().set_tensor_shape(scale_out_shape).set_data_layout(data_layout));
    const PadStrideInfo conv_info(1, 1, 0, 0, 0, 0, DimensionRoundingType::CEIL);

    // ...

    return Status{};
}

References ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_LAYOUT, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_MSG, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, arm_compute::test::validation::bias, arm_compute::CEIL, arm_compute::CHANNEL, ICloneable< T >::clone(), arm_compute::misc::shape_calculator::compute_deconvolution_output_shape(), arm_compute::misc::shape_calculator::compute_deconvolution_upsampled_shape(), arm_compute::test::validation::conv_info, arm_compute::test::validation::data_layout, ITensorInfo::data_layout(), ITensorInfo::data_type(), arm_compute::deconvolution_output_dimensions(), ITensorInfo::dimension(), arm_compute::F16, arm_compute::F32, arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, arm_compute::test::validation::info, arm_compute::is_data_type_quantized_asymmetric(), arm_compute::test::validation::output_shape, arm_compute::QASYMM8, arm_compute::S32, CLDeconvolutionLayerUpsample::validate(), CLConvolutionLayer::validate(), arm_compute::test::validation::weights, arm_compute::test::validation::weights_info, and arm_compute::WIDTH.

Referenced by CLDirectDeconvolutionLayer::configure(), and CLDeconvolutionLayer::validate().
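The shape-related checks visible in the listing (square kernel, symmetric padding, expected output size) can be modelled on plain integers. This is a sketch only: `DeconvDesc` and `validate_sketch` are hypothetical names, the checks on data types and layouts are left out, and an empty string stands in for a successful Status.

```cpp
#include <string>

// Hypothetical plain-integer description of one deconvolution configuration.
struct DeconvDesc
{
    unsigned int in_w, in_h;
    unsigned int kernel_w, kernel_h;
    unsigned int stride_x, stride_y;
    unsigned int pad_left, pad_right, pad_top, pad_bottom;
    unsigned int out_w, out_h;
};

// Returns an empty string if the description passes the checks, otherwise an error message.
std::string validate_sketch(const DeconvDesc &d)
{
    if(d.kernel_w != d.kernel_h)
    {
        return "Kernel must be square.";
    }
    if(d.kernel_w < 1)
    {
        return "Kernel must be at least 1x1.";
    }
    if(d.pad_left != d.pad_right || d.pad_top != d.pad_bottom)
    {
        return "Padding must be symmetric.";
    }
    if(d.in_w < 1 || d.in_h < 1)
    {
        return "Input is empty.";
    }

    // Expected output size: (in - 1) * stride - 2 * padding + kernel
    const unsigned int full_w = (d.in_w - 1) * d.stride_x + d.kernel_w;
    const unsigned int full_h = (d.in_h - 1) * d.stride_y + d.kernel_h;
    if(full_w <= 2 * d.pad_left || full_h <= 2 * d.pad_top)
    {
        return "Padding is too large.";
    }
    if(d.out_w != full_w - 2 * d.pad_left)
    {
        return "Output's width is invalid.";
    }
    if(d.out_h != full_h - 2 * d.pad_top)
    {
        return "Output's height is invalid.";
    }
    return {};
}
```

Running such a check before any resources are allocated follows the same idea as calling the static validate() before configure(): a bad configuration is reported as a status instead of failing later.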


The documentation for this class was generated from the following files:

  • CLDirectDeconvolutionLayer.h
  • CLDirectDeconvolutionLayer.cpp