Compute Library 21.02
CLGEMMDeconvolutionLayer Class Reference

Function to run the deconvolution layer through a call to GEMM. More...

#include <CLGEMMDeconvolutionLayer.h>

Collaboration diagram for CLGEMMDeconvolutionLayer:

Public Member Functions

 CLGEMMDeconvolutionLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Constructor. More...
 
 CLGEMMDeconvolutionLayer (const CLGEMMDeconvolutionLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 CLGEMMDeconvolutionLayer (CLGEMMDeconvolutionLayer &&)=default
 Default move constructor. More...
 
CLGEMMDeconvolutionLayer& operator= (const CLGEMMDeconvolutionLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
CLGEMMDeconvolutionLayer& operator= (CLGEMMDeconvolutionLayer &&)=default
 Default move assignment operator. More...
 
 ~CLGEMMDeconvolutionLayer ()
 Default destructor. More...
 
void configure (const ICLTensor *input, const ICLTensor *weights, const ICLTensor *bias, ICLTensor *output, const PadStrideInfo &deconv_info)
 Set the input, weights, biases and output tensors. More...
 
void configure (const CLCompileContext &compile_context, const ICLTensor *input, const ICLTensor *weights, const ICLTensor *bias, ICLTensor *output, const PadStrideInfo &deconv_info)
 Set the input, weights, biases and output tensors. More...
 
void run () override
 Run the kernels contained in the function. More...
 
void prepare () override
 Prepare the function for executing. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *bias, const ITensorInfo *output, const PadStrideInfo &deconv_info)
 Static function to check if given info will lead to a valid configuration of CLGEMMDeconvolutionLayer. More...
 

Detailed Description

Function to run the deconvolution layer through a call to GEMM.

Deconvolution Layer is the backward pass of Convolution Layer. First we transform the input depending on the stride and pad info and then perform a 1x1 convolution pass. The input stride defines how many zeroes to insert between each pair of adjacent input elements, pad is the amount of padding, and a is a user-specified value, with a < stride - 1, that increases the padding at the top and right of the input image.

The relation between input to output is as follows:

\[ width\_output = (width\_input - 1) \cdot stride\_x - 2 \cdot padding\_x + kernel\_x \]

\[ height\_output = (height\_input - 1) \cdot stride\_y - 2 \cdot padding\_y + kernel\_y \]

where:

  • width_input is the size of the first input dimension.
  • height_input is the size of the second input dimension.
  • width_output is the size of the first output dimension.
  • height_output is the size of the second output dimension.
  • kernel_x and kernel_y are the convolution sizes in x and y.
  • stride_x and stride_y are the input strides of the first and second dimensions.
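For example, for an input of width 4 with stride_x = 2, padding_x = 0 and a 3x3 kernel, the output width is (4 - 1) * 2 - 2 * 0 + 3 = 9.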

The weights used by Deconvolution are supposed to be the same as the ones used for Convolution.

This function calls the following OpenCL kernels/functions:

  1. CLGEMMLowpMatrixMultiplyCore
  2. CLGEMMLowpOutputStage
  3. CLPermute
  4. CLPermute
  5. CLReshapeLayer
  6. CLTranspose
  7. CLDeconvolutionReshapeOutputKernel
  8. CLSlice

Definition at line 79 of file CLGEMMDeconvolutionLayer.h.
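A minimal usage sketch follows. It is illustrative only: the shapes, data type, 2x2 kernel and stride of 2 are assumptions chosen to satisfy the constraints documented below (stride equal to the kernel size, no padding), not values taken from this page.

    #include "arm_compute/runtime/CL/CLFunctions.h"
    #include "arm_compute/runtime/CL/CLScheduler.h"

    using namespace arm_compute;

    int main()
    {
        CLScheduler::get().default_init(); // set up the OpenCL context and queue

        // NCHW shapes are (W, H, C[, N]); NCHW input is permuted to NHWC internally.
        CLTensor input, weights, bias, output;
        input.allocator()->init(TensorInfo(TensorShape(8U, 8U, 16U), 1, DataType::F32));
        weights.allocator()->init(TensorInfo(TensorShape(2U, 2U, 16U, 4U), 1, DataType::F32)); // [width, height, IFM, OFM]
        bias.allocator()->init(TensorInfo(TensorShape(4U), 1, DataType::F32));
        // Output width/height follow the formulas above: (8 - 1) * 2 - 2 * 0 + 2 = 16.
        output.allocator()->init(TensorInfo(TensorShape(16U, 16U, 4U), 1, DataType::F32));

        CLGEMMDeconvolutionLayer deconv;
        deconv.configure(&input, &weights, &bias, &output, PadStrideInfo(2, 2, 0, 0)); // stride == kernel size, no padding

        input.allocator()->allocate();
        weights.allocator()->allocate();
        bias.allocator()->allocate();
        output.allocator()->allocate();

        // ... map the tensors and fill input, weights and bias here ...

        deconv.run();              // enqueues the kernels; does not block
        CLScheduler::get().sync(); // wait for the results before reading output
        return 0;
    }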

Constructor & Destructor Documentation

◆ CLGEMMDeconvolutionLayer() [1/3]

CLGEMMDeconvolutionLayer ( std::shared_ptr< IMemoryManager > memory_manager = nullptr )

Constructor.

Definition at line 107 of file CLGEMMDeconvolutionLayer.cpp.

108  : _memory_group(std::move(memory_manager)),
109  _mm_gemm(),
110  _mm_gemmlowp(),
111  _gemmlowp_output_stage(),
112  _permute_input_to_nhwc(),
113  _permute_weights_to_nhwc(),
114  _reshape_weights(),
115  _transpose_weights(),
116  _deconv_reshape(std::make_unique<CLDeconvolutionReshapeOutputKernel>()),
117  _slice_gemm(),
118  _gemmlowp_final(),
119  _reshaped_weights(),
120  _reshaped_weights_t(),
121  _permuted_input(),
122  _permuted_weights(),
123  _gemm_output(),
124  _slice_gemm_input(),
125  _original_weights(),
126  _is_prepared(false),
127  _padded_input(false),
128  _is_nchw(false),
129  _is_quantized(false)
130 {
131 }
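The memory_manager argument lets this function share temporary-buffer memory with other functions. A sketch of constructing one, assuming the usual BlobLifetimeManager/PoolManager pairing (not taken from this page):

    #include "arm_compute/runtime/BlobLifetimeManager.h"
    #include "arm_compute/runtime/MemoryManagerOnDemand.h"
    #include "arm_compute/runtime/PoolManager.h"

    auto lifetime_mgr = std::make_shared<BlobLifetimeManager>();
    auto pool_mgr     = std::make_shared<PoolManager>();
    auto mm           = std::make_shared<MemoryManagerOnDemand>(lifetime_mgr, pool_mgr);

    CLGEMMDeconvolutionLayer deconv(mm); // intermediate tensors are managed through mm
    // After configuring every function that shares mm, populate its pools, e.g.:
    // CLBufferAllocator alloc; mm->populate(alloc, 1 /* num_pools */);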

◆ CLGEMMDeconvolutionLayer() [2/3]

CLGEMMDeconvolutionLayer ( const CLGEMMDeconvolutionLayer & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ CLGEMMDeconvolutionLayer() [3/3]

CLGEMMDeconvolutionLayer ( CLGEMMDeconvolutionLayer && )
default

Default move constructor.

◆ ~CLGEMMDeconvolutionLayer()

~CLGEMMDeconvolutionLayer ( )

Default destructor.

Member Function Documentation

◆ configure() [1/2]

void configure ( const ICLTensor * input,
const ICLTensor * weights,
const ICLTensor * bias,
ICLTensor * output,
const PadStrideInfo & deconv_info
)

Set the input, weights, biases and output tensors.

Parameters
[in,out]  input        Input tensor. 3 lower dimensions represent a single input, and an optional 4th dimension for batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32. Data layout supported: NHWC
[in]      weights      The 4d weights with dimensions [width, height, IFM, OFM]. Data type supported: Same as input. Data layout supported: same as input.
[in]      bias         (Optional) The biases have one dimension. Data type supported: Same as input. Data layout supported: same as input.
[out]     output       Output tensor. The output has the same number of dimensions as the input. Data layout supported: same as input.
[in]      deconv_info  Contains padding and policies to be used in the deconvolution, this is described in PadStrideInfo. This function supports only stride_x = weights.width && stride_y = weights.height. Moreover, padding is not supported.

Definition at line 230 of file CLGEMMDeconvolutionLayer.cpp.

References CLKernelLibrary::get().

231 {
232  configure(CLKernelLibrary::get().get_compile_context(), input, weights, bias, output, deconv_info);
233 }

◆ configure() [2/2]

void configure ( const CLCompileContext & compile_context,
const ICLTensor * input,
const ICLTensor * weights,
const ICLTensor * bias,
ICLTensor * output,
const PadStrideInfo & deconv_info
)

Set the input, weights, biases and output tensors.

Parameters
[in]      compile_context  The compile context to be used.
[in,out]  input            Input tensor. 3 lower dimensions represent a single input, and an optional 4th dimension for batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32. Data layout supported: NHWC
[in]      weights          The 4d weights with dimensions [width, height, IFM, OFM]. Data type supported: Same as input. Data layout supported: same as input.
[in]      bias             (Optional) The biases have one dimension. Data type supported: Same as input. Data layout supported: same as input.
[out]     output           Output tensor. The output has the same number of dimensions as the input. Data layout supported: same as input.
[in]      deconv_info      Contains padding and policies to be used in the deconvolution, this is described in PadStrideInfo. This function supports only stride_x = weights.width && stride_y = weights.height. Moreover, padding is not supported.

Definition at line 235 of file CLGEMMDeconvolutionLayer.cpp.

References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, CLTranspose::configure(), CLReshapeLayer::configure(), CLPermute::configure(), CLGEMMLowpMatrixMultiplyCore::configure(), CLSlice::configure(), CLGEMM::configure(), CLGEMMLowpOutputStage::configure(), ITensorInfo::data_layout(), ITensorInfo::data_type(), ITensorInfo::dimension(), arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, ITensor::info(), CLTensor::info(), ITensorAllocator::init(), arm_compute::test::validation::input, arm_compute::is_data_type_quantized_asymmetric(), MemoryGroup::manage(), arm_compute::NCHW, UniformQuantizationInfo::offset, PadStrideInfo::pad_bottom(), PadStrideInfo::pad_left(), PadStrideInfo::pad_right(), PadStrideInfo::pad_top(), ITensorInfo::quantization_info(), UniformQuantizationInfo::scale, TensorInfo::set_quantization_info(), arm_compute::U, QuantizationInfo::uniform(), and CLGEMMDeconvolutionLayer::validate().

237 {
238  ARM_COMPUTE_ERROR_ON_NULLPTR(input, weights, output);
239  ARM_COMPUTE_ERROR_THROW_ON(CLGEMMDeconvolutionLayer::validate(input->info(),
240  weights->info(),
241  bias != nullptr ? bias->info() : nullptr,
242  output->info(),
243  deconv_info));
244 
245  _original_weights = weights;
246  _padded_input = deconv_info.pad_bottom() > 0 || deconv_info.pad_left() > 0 || deconv_info.pad_right() > 0 || deconv_info.pad_top() > 0;
247  _is_nchw = input->info()->data_layout() == DataLayout::NCHW;
248  _is_quantized = is_data_type_quantized_asymmetric(input->info()->data_type());
249 
250  const ICLTensor *input_to_use = input;
251  const ICLTensor *weights_to_use = weights;
252 
253  // If the data layout is NCHW, transform everything in NHWC. Another alternative could be to
254  // do an outer product in NCHW and then an accumulation through a reduction. This would have two
255  // drawbacks: first, the outer product is less efficient than a full GEMM. Second, the reduction
256  // might be slower than GEMM.
257  if(_is_nchw)
258  {
259  _memory_group.manage(&_permuted_input);
260  _permute_input_to_nhwc.configure(compile_context, input, &_permuted_input, PermutationVector(2U, 0U, 1U));
261 
262  _permute_weights_to_nhwc.configure(compile_context, weights, &_permuted_weights, PermutationVector(2U, 0U, 1U));
263 
264  input_to_use = &_permuted_input;
265  weights_to_use = &_permuted_weights;
266  }
267 
268  // Reshape the input weights. The weights will be reshaped only once during the call to prepare()
269  _reshaped_weights.allocator()->init(TensorInfo(TensorShape(weights_to_use->info()->dimension(0),
270  weights_to_use->info()->dimension(1) * weights_to_use->info()->dimension(2) * weights_to_use->info()->dimension(3)),
271  1,
272  input->info()->data_type(), weights->info()->quantization_info()));
273 
274  _reshape_weights.configure(compile_context, weights_to_use, &_reshaped_weights);
275  _transpose_weights.configure(compile_context, &_reshaped_weights, &_reshaped_weights_t);
276 
277  const size_t idx_h = get_data_layout_dimension_index(input->info()->data_layout(), DataLayoutDimension::HEIGHT);
278  GEMMInfo gemm_info(false, false, true, input->info()->dimension(idx_h), true);
279 
280  // Configure output stage for asymmetric quantized types
281  if(_is_quantized)
282  {
283  // gemmlowp adds the offsets (instead of subtracting them). Thus, we need to negate the original
284  // and restore them back to make it work properly.
285  QuantizationInfo iq_info = input->info()->quantization_info();
286  QuantizationInfo wq_info = weights->info()->quantization_info();
287 
288  input_to_use->info()->set_quantization_info(QuantizationInfo(iq_info.uniform().scale, -iq_info.uniform().offset));
289  _reshaped_weights_t.info()->set_quantization_info(QuantizationInfo(wq_info.uniform().scale, -wq_info.uniform().offset));
290 
291  _mm_gemmlowp.configure(compile_context, input_to_use, &_reshaped_weights_t, nullptr, &_gemm_output, gemm_info);
292 
293  input_to_use->info()->set_quantization_info(iq_info);
294  _reshaped_weights_t.info()->set_quantization_info(wq_info);
295  }
296  else
297  {
298  _mm_gemm.configure(compile_context, input_to_use, &_reshaped_weights_t, nullptr, &_gemm_output, 1.f, 0.0f, gemm_info);
299  }
300 
301  if(_is_nchw)
302  {
303  _permuted_input.allocator()->allocate();
304  }
305 
306  ICLTensor *deconv_reshape_output = nullptr;
307  ICLTensor *slice_output = nullptr;
308  ICLTensor *output_stage_output = nullptr;
309 
310  if(_padded_input && _is_quantized)
311  {
312  _memory_group.manage(&_slice_gemm_input);
313  _memory_group.manage(&_gemmlowp_final);
314  deconv_reshape_output = &_gemmlowp_final;
315  output_stage_output = &_slice_gemm_input;
316  slice_output = output;
317  }
318  else if(_padded_input)
319  {
320  _memory_group.manage(&_slice_gemm_input);
321  deconv_reshape_output = &_slice_gemm_input;
322  slice_output = output;
323  }
324  else if(_is_quantized)
325  {
326  _memory_group.manage(&_gemmlowp_final);
327  deconv_reshape_output = &_gemmlowp_final;
328  output_stage_output = output;
329  }
330  else
331  {
332  deconv_reshape_output = output;
333  }
334 
335  // Configure a Col2Im call to reshape the output of GEMM
336  _deconv_reshape->configure(compile_context, &_gemm_output, bias, deconv_reshape_output, input->info(), weights->info(), deconv_info);
337  _gemm_output.allocator()->allocate();
338 
339  if(_is_quantized)
340  {
341  GEMMLowpOutputStageInfo output_stage_info;
342  construct_gemmlowp_output_stage(input->info(), weights->info(), output->info(), output_stage_info);
343  _gemmlowp_output_stage.configure(compile_context, &_gemmlowp_final, nullptr, output_stage_output, output_stage_info);
344  _gemmlowp_final.allocator()->allocate();
345  }
346 
347  // If the input was padded, the output needs to be sliced.
348  if(_padded_input)
349  {
350  const auto start_end = compute_start_end_slice_coordinates(*deconv_reshape_output->info(), deconv_info, _is_nchw);
351  _slice_gemm.configure(compile_context, &_slice_gemm_input, slice_output, start_end.first, start_end.second);
352  _slice_gemm_input.allocator()->allocate();
353  }
354 }

◆ operator=() [1/2]

CLGEMMDeconvolutionLayer& operator= ( const CLGEMMDeconvolutionLayer & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

CLGEMMDeconvolutionLayer& operator= ( CLGEMMDeconvolutionLayer && )
default

Default move assignment operator.

◆ prepare()

void prepare ( )
override virtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here

Note
Prepare stage might not need all the function's buffers' backing memory to be available in order to execute

Reimplemented from IFunction.

Definition at line 389 of file CLGEMMDeconvolutionLayer.cpp.

References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_ON, CLTensorAllocator::free(), ITensor::is_used(), ITensor::mark_as_unused(), CLGEMMLowpMatrixMultiplyCore::prepare(), CLGEMM::prepare(), ICLSimpleFunction::run(), CLReshapeLayer::run(), and CLPermute::run().

Referenced by CLGEMMDeconvolutionLayer::run().

390 {
391  if(!_is_prepared)
392  {
393  ARM_COMPUTE_ERROR_ON(!_original_weights->is_used());
394 
395  if(_is_nchw)
396  {
397  _permuted_weights.allocator()->allocate();
398  _permute_weights_to_nhwc.run();
399  }
400 
401  _reshaped_weights.allocator()->allocate();
402  _reshape_weights.run();
403 
404  if(_is_nchw)
405  {
406  _permuted_weights.allocator()->free();
407  }
408 
409  _reshaped_weights_t.allocator()->allocate();
410  _transpose_weights.run();
411 
412  // Prepare gemm
413  if(!_is_quantized)
414  {
415  _mm_gemm.prepare();
416  }
417  else
418  {
419  _mm_gemmlowp.prepare();
420  }
421 
422  // Free resources
423  if(!_reshaped_weights_t.is_used())
424  {
425  _reshaped_weights_t.allocator()->free();
426  }
427 
428  _original_weights->mark_as_unused();
429  _is_prepared = true;
430  }
431 }

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For Neon kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Will call prepare() on first run if it hasn't been done

Implements IFunction.

Definition at line 356 of file CLGEMMDeconvolutionLayer.cpp.

References CLScheduler::enqueue(), CLScheduler::get(), CLGEMMDeconvolutionLayer::prepare(), ICLSimpleFunction::run(), CLPermute::run(), CLGEMMLowpMatrixMultiplyCore::run(), CLSlice::run(), and CLGEMM::run().

357 {
358  prepare();
359 
360  MemoryGroupResourceScope scope_mg(_memory_group);
361 
362  if(_is_nchw)
363  {
364  _permute_input_to_nhwc.run();
365  }
366 
367  if(_is_quantized)
368  {
369  _mm_gemmlowp.run();
370  }
371  else
372  {
373  _mm_gemm.run();
374  }
375 
376  CLScheduler::get().enqueue(*_deconv_reshape, false);
377 
378  if(_is_quantized)
379  {
380  _gemmlowp_output_stage.run();
381  }
382 
383  if(_padded_input)
384  {
385  _slice_gemm.run();
386  }
387 }
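Since run() only enqueues and flushes, the caller must synchronise before reading results on the host. A minimal sketch, assuming a configured instance deconv:

    deconv.run();              // enqueue the kernels and flush the queue; returns immediately
    CLScheduler::get().sync(); // block until every enqueued kernel has finished
    // The output tensor can now be mapped and read safely.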

◆ validate()

Status validate ( const ITensorInfo * input,
const ITensorInfo * weights,
const ITensorInfo * bias,
const ITensorInfo * output,
const PadStrideInfo & deconv_info
)
static

Static function to check if given info will lead to a valid configuration of CLGEMMDeconvolutionLayer.

Parameters
[in]  input        Input tensor info. 3 lower dimensions represent a single input, and an optional 4th dimension for batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32. Data layout supported: NHWC
[in]  weights      The 4d weights info with dimensions [width, height, IFM, OFM]. Data type supported: Same as input. Data layout supported: same as input.
[in]  bias         (Optional) The biases have one dimension. Data type supported: Same as input. Data layout supported: same as input.
[in]  output       Output tensor info. The output has the same number of dimensions as the input. Data layout supported: same as input.
[in]  deconv_info  Contains padding and policies to be used in the deconvolution, this is described in PadStrideInfo.
Returns
a status
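
A sketch of a pre-flight check (the shapes mirror the illustrative example in the detailed description and are assumptions):

    TensorInfo input_info(TensorShape(8U, 8U, 16U), 1, DataType::F32);
    TensorInfo weights_info(TensorShape(2U, 2U, 16U, 4U), 1, DataType::F32); // [width, height, IFM, OFM]
    TensorInfo bias_info(TensorShape(4U), 1, DataType::F32);
    TensorInfo output_info(TensorShape(16U, 16U, 4U), 1, DataType::F32);

    const Status status = CLGEMMDeconvolutionLayer::validate(&input_info, &weights_info, &bias_info,
                                                             &output_info, PadStrideInfo(2, 2, 0, 0));
    if(!bool(status))
    {
        // Unsupported configuration; status.error_description() explains why.
    }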

Definition at line 135 of file CLGEMMDeconvolutionLayer.cpp.

References ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_LAYOUT, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, arm_compute::BATCHES, ICloneable< T >::clone(), TensorInfo::clone(), arm_compute::misc::shape_calculator::compute_deconvolution_output_shape(), arm_compute::test::validation::data_layout, ITensorInfo::data_layout(), ITensorInfo::data_type(), arm_compute::deconvolution_output_dimensions(), ITensorInfo::dimension(), arm_compute::F16, arm_compute::F32, arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, arm_compute::is_data_type_quantized_asymmetric(), arm_compute::NCHW, PadStrideInfo::pad_bottom(), PadStrideInfo::pad_left(), PadStrideInfo::pad_right(), PadStrideInfo::pad_top(), arm_compute::permute(), arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, arm_compute::S32, TensorInfo::set_data_type(), PadStrideInfo::stride(), ITensorInfo::tensor_shape(), CLTranspose::validate(), CLReshapeLayer::validate(), CLPermute::validate(), CLDeconvolutionReshapeOutputKernel::validate(), CLGEMMLowpMatrixMultiplyCore::validate(), CLSlice::validate(), CLGEMM::validate(), CLGEMMLowpOutputStage::validate(), and arm_compute::WIDTH.

Referenced by CLGEMMDeconvolutionLayer::configure(), and CLDeconvolutionLayer::validate().

136 {
137  ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(input, weights, output);
138  ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(input, 1, DataType::QASYMM8, DataType::QASYMM8_SIGNED, DataType::F16, DataType::F32);
139  ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(weights, input);
140  ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_LAYOUT(weights, input);
141 
142  DataLayout data_layout = input->data_layout();
143  const bool padded_input = deconv_info.pad_bottom() > 0 || deconv_info.pad_left() > 0 || deconv_info.pad_right() > 0 || deconv_info.pad_top() > 0;
144  const bool is_nchw = input->data_layout() == DataLayout::NCHW;
145  const bool is_quantized = is_data_type_quantized_asymmetric(input->data_type());
146 
147  const size_t idx_w = get_data_layout_dimension_index(data_layout, DataLayoutDimension::WIDTH);
148  const size_t idx_h = get_data_layout_dimension_index(data_layout, DataLayoutDimension::HEIGHT);
149  const size_t idx_b = get_data_layout_dimension_index(data_layout, DataLayoutDimension::BATCHES);
150 
151  ARM_COMPUTE_RETURN_ERROR_ON(weights->dimension(idx_w) != deconv_info.stride().first);
152  ARM_COMPUTE_RETURN_ERROR_ON(weights->dimension(idx_h) != deconv_info.stride().second);
153 
154  TensorShape nhwc_weights_shape = weights->tensor_shape();
155  TensorShape nhwc_input_shape = input->tensor_shape();
156 
157  if(is_nchw)
158  {
159  permute(nhwc_weights_shape, PermutationVector(2, 0, 1));
160  permute(nhwc_input_shape, PermutationVector(2, 0, 1));
161 
162  TensorInfo nhwc_input_info = input->clone()->set_is_resizable(true).reset_padding().set_tensor_shape(nhwc_input_shape).set_data_layout(DataLayout::NCHW);
163 
164  TensorInfo nhwc_weights_info = weights->clone()->set_is_resizable(true).reset_padding().set_tensor_shape(nhwc_weights_shape).set_data_layout(DataLayout::NCHW);
165 
166  CLPermute::validate(weights, &nhwc_weights_info, PermutationVector(2, 0, 1));
167  CLPermute::validate(input, &nhwc_input_info, PermutationVector(2, 0, 1));
168  }
169 
170  const TensorShape reshaped_shape = TensorShape(nhwc_weights_shape[0], nhwc_weights_shape[1] * nhwc_weights_shape[2] * nhwc_weights_shape[3]);
171  const TensorInfo reshaped_info = weights->clone()->set_tensor_shape(reshaped_shape).set_data_layout(DataLayout::NCHW).set_is_resizable(true);
172  ARM_COMPUTE_RETURN_ON_ERROR(CLReshapeLayer::validate(weights, &reshaped_info));
173 
174  TensorShape transposed_shape(reshaped_shape[1], reshaped_shape[0]);
175  const TensorInfo reshaped_t_info = reshaped_info.clone()->set_is_resizable(true).set_tensor_shape(transposed_shape);
176  ARM_COMPUTE_RETURN_ON_ERROR(CLTranspose::validate(&reshaped_info, &reshaped_t_info));
177 
178  TensorShape gemm_output_shape(weights->dimension(idx_w) * weights->dimension(idx_h) * weights->dimension(idx_b),
179  input->dimension(idx_w),
180  input->dimension(idx_h),
181  input->dimension(idx_b));
182 
183  TensorInfo gemm_output_info = reshaped_t_info.clone()->set_tensor_shape(gemm_output_shape).set_is_resizable(true);
184  GEMMInfo gemm_info(false, false, true, input->dimension(idx_h), true);
185 
186  GEMMLowpOutputStageInfo output_stage_info;
187 
188  if(is_quantized)
189  {
190  ARM_COMPUTE_RETURN_ON_ERROR(CLGEMMLowpMatrixMultiplyCore::validate(&input->clone()->set_tensor_shape(nhwc_input_shape), &reshaped_t_info, nullptr, &gemm_output_info.set_data_type(DataType::S32),
191  gemm_info));
192  ARM_COMPUTE_RETURN_ON_ERROR(construct_gemmlowp_output_stage(input, weights, output, output_stage_info));
193  }
194  else
195  {
196  ARM_COMPUTE_RETURN_ON_ERROR(CLGEMM::validate(&input->clone()->set_tensor_shape(nhwc_input_shape).set_is_resizable(true), &reshaped_t_info, nullptr, &gemm_output_info, 1.0f, 0.0f, gemm_info));
197  }
198 
199  const PadStrideInfo stride_info(deconv_info.stride().first, deconv_info.stride().second);
200  auto out_dims = deconvolution_output_dimensions(input->dimension(idx_w), input->dimension(idx_h), weights->dimension(idx_w), weights->dimension(idx_h), stride_info);
201  const TensorShape deconv_shape = misc::shape_calculator::compute_deconvolution_output_shape(out_dims, *input, *weights);
202  TensorInfo col2im_output_info = gemm_output_info.clone()->set_tensor_shape(deconv_shape).set_is_resizable(true);
203 
204  if(padded_input && is_quantized)
205  {
206  const auto start_end = compute_start_end_slice_coordinates(col2im_output_info, deconv_info, is_nchw);
207  ARM_COMPUTE_RETURN_ON_ERROR(CLDeconvolutionReshapeOutputKernel::validate(&gemm_output_info, bias, &col2im_output_info, input, weights, deconv_info));
208  ARM_COMPUTE_RETURN_ON_ERROR(CLGEMMLowpOutputStage::validate(&col2im_output_info, nullptr, &col2im_output_info.clone()->set_is_resizable(true).set_data_type(input->data_type()), output_stage_info));
209  ARM_COMPUTE_RETURN_ON_ERROR(CLSlice::validate(&col2im_output_info.clone()->set_is_resizable(true).set_data_type(input->data_type()), output, start_end.first, start_end.second));
210  }
211  else if(padded_input)
212  {
213  const auto start_end = compute_start_end_slice_coordinates(col2im_output_info, deconv_info, is_nchw);
214  ARM_COMPUTE_RETURN_ON_ERROR(CLDeconvolutionReshapeOutputKernel::validate(&gemm_output_info, bias, &col2im_output_info, input, weights, deconv_info));
215  ARM_COMPUTE_RETURN_ON_ERROR(CLSlice::validate(&col2im_output_info, output, start_end.first, start_end.second));
216  }
217  else if(is_quantized)
218  {
219  ARM_COMPUTE_RETURN_ON_ERROR(CLDeconvolutionReshapeOutputKernel::validate(&gemm_output_info, bias, &col2im_output_info, input, weights, deconv_info));
220  ARM_COMPUTE_RETURN_ON_ERROR(CLGEMMLowpOutputStage::validate(&col2im_output_info, nullptr, output, output_stage_info));
221  }
222  else
223  {
224  ARM_COMPUTE_RETURN_ON_ERROR(CLDeconvolutionReshapeOutputKernel::validate(&gemm_output_info, bias, output, input, weights, deconv_info));
225  }
226 
227  return Status{};
228 }

The documentation for this class was generated from the following files:

  • CLGEMMDeconvolutionLayer.h
  • CLGEMMDeconvolutionLayer.cpp