24.02.1
Function to run the deconvolution layer through a call to GEMM.
#include <CLGEMMDeconvolutionLayer.h>
CLGEMMDeconvolutionLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
    Constructor.
CLGEMMDeconvolutionLayer (const CLGEMMDeconvolutionLayer &)=delete
    Prevent instances of this class from being copied (as this class contains pointers).
CLGEMMDeconvolutionLayer (CLGEMMDeconvolutionLayer &&)=default
    Default move constructor.
CLGEMMDeconvolutionLayer & operator= (const CLGEMMDeconvolutionLayer &)=delete
    Prevent instances of this class from being copied (as this class contains pointers).
CLGEMMDeconvolutionLayer & operator= (CLGEMMDeconvolutionLayer &&)=default
    Default move assignment operator.
~CLGEMMDeconvolutionLayer ()
    Default destructor.
void configure (const ICLTensor *input, const ICLTensor *weights, const ICLTensor *bias, ICLTensor *output, const PadStrideInfo &deconv_info)
    Set the input, weights, biases and output tensors.
void configure (const CLCompileContext &compile_context, const ICLTensor *input, const ICLTensor *weights, const ICLTensor *bias, ICLTensor *output, const PadStrideInfo &deconv_info)
    Set the input, weights, biases and output tensors.
void run () override
    Run the kernels contained in the function.
void prepare () override
    Prepare the function for executing.
virtual | ~IFunction ()=default
    Destructor.
|
Function to run the deconvolution layer through a call to GEMM.
Deconvolution Layer is the backward pass of Convolution Layer. First we transform the input depending on the stride and pad info, and then perform a 1x1 convolution pass. The input stride defines how many zeroes we should put between each element of the input, pad is the amount of padding, and finally a is a user-specified value, where a < stride - 1, that increases the padding at the top and right of the input image.
The relation between input to output is as follows:
\[ width\_output = (width\_input - 1) \cdot stride\_x - 2 \cdot padding\_x + kernel\_x \]
\[ height\_output = (height\_input - 1) \cdot stride\_y - 2 \cdot padding\_y + kernel\_y \]
where:
- width_input is the size of the first input dimension.
- height_input is the size of the second input dimension.
- width_output is the size of the first output dimension.
- height_output is the size of the second output dimension.
- kernel_x and kernel_y are the convolution sizes in x and y.
- stride_x and stride_y are the input strides of the first and second dimensions.
The weights used by Deconvolution are supposed to be the same as the ones used for Convolution.
This function calls the following OpenCL kernels/functions:
- CLGEMMLowpMatrixMultiplyCore
- CLGEMMLowpOutputStage
- CLPermute
- CLPermute
- CLReshapeLayer
- CLTranspose
- CLDeconvolutionReshapeOutputKernel
- CLSlice
Definition at line 81 of file CLGEMMDeconvolutionLayer.h.
◆ CLGEMMDeconvolutionLayer() [1/3]
Constructor.
Definition at line 102 of file CLGEMMDeconvolutionLayer.cpp.
CLGEMMDeconvolutionLayer::CLGEMMDeconvolutionLayer(std::shared_ptr<IMemoryManager> memory_manager)
    : _memory_group(std::move(memory_manager)),
      // ...
      _gemmlowp_output_stage(),
      _permute_input_to_nhwc(),
      _permute_weights_to_nhwc(),
      // ...
      _transpose_weights(),
      _deconv_reshape(std::make_unique<CLDeconvolutionReshapeOutputKernel>()),
      // ...
      _reshaped_weights_t(),
      // ...
      _padded_input(false),
      // ...
◆ CLGEMMDeconvolutionLayer() [2/3]
Prevent instances of this class from being copied (as this class contains pointers).
◆ CLGEMMDeconvolutionLayer() [3/3]
Default move constructor.
◆ ~CLGEMMDeconvolutionLayer()
◆ configure() [1/2]
Set the input, weights, biases and output tensors.
- Parameters
-
[in] | compile_context | The compile context to be used. |
[in,out] | input | Input tensor. 3 lower dimensions represent a single input, and an optional 4th dimension for batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32. Data layout supported: NHWC |
[in] | weights | The 4d weights with dimensions [width, height, IFM, OFM]. Data type supported: Same as input . Data layout supported: same as input . |
[in] | bias | (Optional) The biases have one dimension. Data type supported: Same as input . Data layout supported: same as input . |
[out] | output | Output tensor. The output has the same number of dimensions as the input . Data layout supported: same as input . |
[in] | deconv_info | Contains padding and policies to be used in the deconvolution, this is described in PadStrideInfo. This function supports only stride_x = weights.width && stride_y = weights.height. Moreover, padding is not supported. |
Definition at line 262 of file CLGEMMDeconvolutionLayer.cpp.
void CLGEMMDeconvolutionLayer::configure(const CLCompileContext &compile_context, const ICLTensor *input,
                                         const ICLTensor *weights, const ICLTensor *bias, ICLTensor *output,
                                         const PadStrideInfo &deconv_info)
{
    ARM_COMPUTE_ERROR_THROW_ON(CLGEMMDeconvolutionLayer::validate(
        input->info(), weights->info(), bias != nullptr ? bias->info() : nullptr, output->info(), deconv_info));
    // ...
    _original_weights = weights;
    _padded_input     = deconv_info.pad_bottom() > 0 || deconv_info.pad_left() > 0 ||
                        deconv_info.pad_right() > 0 || deconv_info.pad_top() > 0;
    // ...
    const ICLTensor *input_to_use   = input;
    const ICLTensor *weights_to_use = weights;
    // ...
    _memory_group.manage(&_permuted_input);
    // ...
    input_to_use   = &_permuted_input;
    weights_to_use = &_permuted_weights;
    // ... the weights are collapsed to a 2-D matrix before the GEMM:
    //     TensorInfo(TensorShape(weights_to_use->info()->dimension(0),
    //                            weights_to_use->info()->dimension(1) * weights_to_use->info()->dimension(2) *
    //                                weights_to_use->info()->dimension(3)),
    //                1, input->info()->data_type(), weights->info()->quantization_info())
    _reshape_weights.configure(compile_context, weights_to_use, &_reshaped_weights);
    _transpose_weights.configure(compile_context, &_reshaped_weights, &_reshaped_weights_t);
    // ...
    GEMMInfo gemm_info(false, false, true, input->info()->dimension(idx_h), true /* ... */);
    // ... quantized path: negate the offsets for the low-precision GEMM
    QuantizationInfo iq_info = input->info()->quantization_info();
    QuantizationInfo wq_info = weights->info()->quantization_info();
    input_to_use->info()->set_quantization_info(
        QuantizationInfo(iq_info.uniform().scale, -iq_info.uniform().offset));
    // ... same negation applied to the weights:
    //     QuantizationInfo(wq_info.uniform().scale, -wq_info.uniform().offset)
    _mm_gemmlowp.configure(compile_context, input_to_use, &_reshaped_weights_t, nullptr, &_gemm_output, gemm_info);
    input_to_use->info()->set_quantization_info(iq_info);
    // ... non-quantized path:
    _mm_gemm.configure(compile_context, input_to_use, &_reshaped_weights_t, nullptr, &_gemm_output, 1.f, 0.0f,
                       /* ... */);
    // ...
    ICLTensor *deconv_reshape_output = nullptr;
    ICLTensor *slice_output          = nullptr;
    ICLTensor *output_stage_output   = nullptr;
    if (_padded_input && _is_quantized)
    {
        _memory_group.manage(&_slice_gemm_input);
        _memory_group.manage(&_gemmlowp_final);
        deconv_reshape_output = &_gemmlowp_final;
        output_stage_output   = &_slice_gemm_input;
        slice_output          = output;
    }
    else if (_padded_input)
    {
        _memory_group.manage(&_slice_gemm_input);
        deconv_reshape_output = &_slice_gemm_input;
        slice_output          = output;
    }
    else if (_is_quantized)
    {
        _memory_group.manage(&_gemmlowp_final);
        deconv_reshape_output = &_gemmlowp_final;
        output_stage_output   = output;
    }
    else
    {
        deconv_reshape_output = output;
    }
    // ...
    _deconv_reshape->configure(compile_context, &_gemm_output, bias, deconv_reshape_output, input->info(),
                               weights->info(), deconv_info);
    // ...
    GEMMLowpOutputStageInfo output_stage_info;
    construct_gemmlowp_output_stage(input->info(), weights->info(), output->info(), output_stage_info);
    _gemmlowp_output_stage.configure(compile_context, &_gemmlowp_final, nullptr, output_stage_output, /* ... */);
    // ...
    const auto start_end =
        compute_start_end_slice_coordinates(*deconv_reshape_output->info(), deconv_info, _is_nchw);
    _slice_gemm.configure(compile_context, &_slice_gemm_input, slice_output, start_end.first, start_end.second);
}
References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, ARM_COMPUTE_LOG_PARAMS, bias, CLReshapeLayer::configure(), CLTranspose::configure(), CLPermute::configure(), CLSlice::configure(), CLGEMMLowpOutputStage::configure(), CLGEMMLowpMatrixMultiplyCore::configure(), CLGEMM::configure(), ITensorInfo::dimension(), arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, ITensor::info(), CLTensor::info(), ITensorAllocator::init(), arm_compute::test::validation::input, arm_compute::is_data_type_quantized_asymmetric(), MemoryGroup::manage(), arm_compute::NCHW, UniformQuantizationInfo::offset, PadStrideInfo::pad_bottom(), PadStrideInfo::pad_left(), PadStrideInfo::pad_right(), PadStrideInfo::pad_top(), ITensorInfo::quantization_info(), UniformQuantizationInfo::scale, ITensorInfo::set_quantization_info(), TensorInfo::set_quantization_info(), arm_compute::utils::cast::U, QuantizationInfo::uniform(), and CLGEMMDeconvolutionLayer::validate().
◆ configure() [2/2]
Set the input, weights, biases and output tensors.
Valid data layouts:
- NHWC
Valid data type configurations:
src0           | src1           | src2 | dst            |
F16            | F16            | F16  | F16            |
F32            | F32            | F32  | F32            |
QASYMM8        | QASYMM8        | S32  | QASYMM8        |
QASYMM8_SIGNED | QASYMM8_SIGNED | S32  | QASYMM8_SIGNED |
- Parameters
-
[in,out] | input | Input tensor. 3 lower dimensions represent a single input, and an optional 4th dimension for batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32. Data layout supported: NHWC |
[in] | weights | The 4d weights with dimensions [width, height, IFM, OFM]. Data type supported: Same as input . Data layout supported: same as input . |
[in] | bias | (Optional) The biases have one dimension. Data type supported: Same as input . Data layout supported: same as input . |
[out] | output | Output tensor. The output has the same number of dimensions as the input . Data layout supported: same as input . |
[in] | deconv_info | Contains padding and policies to be used in the deconvolution, this is described in PadStrideInfo. This function supports only stride_x = weights.width && stride_y = weights.height. Moreover, padding is not supported. |
Definition at line 253 of file CLGEMMDeconvolutionLayer.cpp.
References bias, CLKernelLibrary::get(), and arm_compute::test::validation::input.
◆ operator=() [1/2]
Default move assignment operator.
◆ operator=() [2/2]
Prevent instances of this class from being copied (as this class contains pointers).
◆ prepare()
Prepare the function for executing.
Any one off pre-processing step required by the function is handled here
- Note
- Prepare stage might not need all the function's buffers' backing memory to be available in order to execute
Reimplemented from IFunction.
Definition at line 426 of file CLGEMMDeconvolutionLayer.cpp.
    // ...
    _permute_weights_to_nhwc.run();
    // ...
    _reshape_weights.run();
    // ...
    _transpose_weights.run();
    // ...
    if (!_reshaped_weights_t.is_used())
    // ...
References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_ON, CLTensorAllocator::free(), ITensor::is_used(), ITensor::mark_as_unused(), CLGEMM::prepare(), CLGEMMLowpMatrixMultiplyCore::prepare(), CLTranspose::run(), CLReshapeLayer::run(), and CLPermute::run().
Referenced by CLGEMMDeconvolutionLayer::run().
◆ run()
◆ validate()
Static function to check if given info will lead to a valid configuration of CLGEMMDeconvolutionLayer.
- Parameters
-
[in] | input | Input tensor info. 3 lower dimensions represent a single input, and an optional 4th dimension for batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32. Data layout supported: NHWC |
[in] | weights | The 4d weights info with dimensions [width, height, IFM, OFM]. Data type supported: Same as input . Data layout supported: same as input . |
[in] | bias | (Optional) The biases have one dimension. Data type supported: Same as input . Data layout supported: same as input . |
[in] | output | Output tensor info. The output has the same number of dimensions as the input . Data layout supported: same as input . |
[in] | deconv_info | Contains padding and policies to be used in the deconvolution, this is described in PadStrideInfo. |
- Returns
- a status
Definition at line 130 of file CLGEMMDeconvolutionLayer.cpp.
    // ...
    const bool padded_input = deconv_info.pad_bottom() > 0 || deconv_info.pad_left() > 0 ||
                              deconv_info.pad_right() > 0 || deconv_info.pad_top() > 0;
    // ...
    TensorShape nhwc_weights_shape = weights->tensor_shape();
    TensorShape nhwc_input_shape   = input->tensor_shape();
    // ...
    TensorInfo nhwc_input_info =
        input->clone()->set_is_resizable(true) /* ... */ .set_tensor_shape(nhwc_input_shape) /* ... */;
    TensorInfo nhwc_weights_info =
        weights->clone()->set_is_resizable(true) /* ... */ .set_tensor_shape(nhwc_weights_shape) /* ... */;
    // ...
    const TensorShape reshaped_shape =
        TensorShape(nhwc_weights_shape[0], nhwc_weights_shape[1] * nhwc_weights_shape[2] * nhwc_weights_shape[3]);
    const TensorInfo reshaped_info =
        weights->clone()->set_tensor_shape(reshaped_shape).set_data_layout(DataLayout::NCHW).set_is_resizable(true);
    // ...
    TensorShape transposed_shape(reshaped_shape[1], reshaped_shape[0]);
    const TensorInfo reshaped_t_info =
        reshaped_info.clone()->set_is_resizable(true).set_tensor_shape(transposed_shape);
    // ...
    TensorShape gemm_output_shape(weights->dimension(idx_w) * weights->dimension(idx_h) * weights->dimension(idx_b),
                                  input->dimension(idx_w), input->dimension(idx_h), input->dimension(idx_b));
    TensorInfo gemm_output_info =
        reshaped_t_info.clone()->set_tensor_shape(gemm_output_shape).set_is_resizable(true);
    GEMMInfo gemm_info(false, false, true, input->dimension(idx_h), true /* ... */);
    GEMMLowpOutputStageInfo output_stage_info;
    // ... quantized path validates the low-precision GEMM with:
    //     &input->clone()->set_tensor_shape(nhwc_input_shape), &reshaped_t_info, nullptr, ...
    // ... float path validates CLGEMM with:
    //     &reshaped_t_info, nullptr, &gemm_output_info, 1.0f, 0.0f, gemm_info
    // ...
    const PadStrideInfo stride_info(deconv_info.stride().first, deconv_info.stride().second);
    // ... deconvolution_output_dimensions(..., weights->dimension(idx_w), weights->dimension(idx_h), stride_info)
    const TensorShape deconv_shape = compute_deconvolution_output_shape(/* ... */);
    TensorInfo col2im_output_info =
        gemm_output_info.clone()->set_tensor_shape(deconv_shape).set_is_resizable(true);
    if (padded_input && is_quantized)
    {
        const auto start_end = compute_start_end_slice_coordinates(col2im_output_info, deconv_info, is_nchw);
        ARM_COMPUTE_RETURN_ON_ERROR(CLDeconvolutionReshapeOutputKernel::validate(
            &gemm_output_info, bias, &col2im_output_info, input, weights, deconv_info));
        ARM_COMPUTE_RETURN_ON_ERROR(CLGEMMLowpOutputStage::validate(
            &col2im_output_info, nullptr,
            &col2im_output_info.clone()->set_is_resizable(true).set_data_type(input->data_type()),
            output_stage_info));
        ARM_COMPUTE_RETURN_ON_ERROR(CLSlice::validate(/* ... */ output, start_end.first, start_end.second));
    }
    else if (padded_input)
    {
        const auto start_end = compute_start_end_slice_coordinates(col2im_output_info, deconv_info, is_nchw);
        ARM_COMPUTE_RETURN_ON_ERROR(CLDeconvolutionReshapeOutputKernel::validate(
            &gemm_output_info, bias, &col2im_output_info, input, weights, deconv_info));
        // ...
    }
    else if (is_quantized)
    {
        ARM_COMPUTE_RETURN_ON_ERROR(CLDeconvolutionReshapeOutputKernel::validate(
            &gemm_output_info, bias, &col2im_output_info, input, weights, deconv_info));
        // ...
    }
    // ...
References ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_LAYOUT, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, arm_compute::BATCHES, bias, ICloneable< T >::clone(), TensorInfo::clone(), arm_compute::misc::shape_calculator::compute_deconvolution_output_shape(), arm_compute::cpu::data_layout, arm_compute::deconvolution_output_dimensions(), ITensorInfo::dimension(), arm_compute::F16, arm_compute::F32, arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, arm_compute::test::validation::input, arm_compute::is_data_type_quantized_asymmetric(), arm_compute::NCHW, PadStrideInfo::pad_bottom(), PadStrideInfo::pad_left(), PadStrideInfo::pad_right(), PadStrideInfo::pad_top(), arm_compute::permute(), arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, arm_compute::S32, TensorInfo::set_data_type(), PadStrideInfo::stride(), ITensorInfo::tensor_shape(), CLTranspose::validate(), CLReshapeLayer::validate(), CLPermute::validate(), CLDeconvolutionReshapeOutputKernel::validate(), CLSlice::validate(), CLGEMMLowpOutputStage::validate(), CLGEMM::validate(), CLGEMMLowpMatrixMultiplyCore::validate(), and arm_compute::WIDTH.
Referenced by CLGEMMDeconvolutionLayer::configure(), and CLDeconvolutionLayer::validate().
The documentation for this class was generated from the following files:
- CLGEMMDeconvolutionLayer.h
- CLGEMMDeconvolutionLayer.cpp