24.02.1
Function to run the deconvolution layer.
#include <NEDeconvolutionLayer.h>
Public Member Functions

NEDeconvolutionLayer(std::shared_ptr<IMemoryManager> memory_manager = nullptr)
    Constructor.
NEDeconvolutionLayer(const NEDeconvolutionLayer &) = delete
    Prevent instances of this class from being copied (as this class contains pointers).
NEDeconvolutionLayer(NEDeconvolutionLayer &&) = default
    Default move constructor.
NEDeconvolutionLayer &operator=(const NEDeconvolutionLayer &) = delete
    Prevent instances of this class from being copied (as this class contains pointers).
NEDeconvolutionLayer &operator=(NEDeconvolutionLayer &&) = default
    Default move assignment operator.
~NEDeconvolutionLayer() = default
    Default destructor.
void configure(ITensor *input, const ITensor *weights, const ITensor *bias, ITensor *output, const PadStrideInfo &info, bool enable_fast_math = false, const WeightsInfo &weights_info = WeightsInfo())
    Set the input, weights, biases and output tensors.
void run() override
    Run the kernels contained in the function.
void prepare() override
    Prepare the function for executing.

Public Member Functions inherited from IFunction

virtual ~IFunction() = default
    Destructor.

Static Public Member Functions

static Status validate(const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *bias, const ITensorInfo *output, const PadStrideInfo &info, bool enable_fast_math = false, const WeightsInfo &weights_info = WeightsInfo())
    Static function to check if given info will lead to a valid configuration of NEDeconvolutionLayer.
Function to run the deconvolution layer.
Deconvolution Layer is the backward pass of Convolution Layer. First we transform the input depending on the stride and pad info, and then perform a 1x1 convolution pass. The input stride defines how many zeroes we should put between each element of the input, pad is the amount of padding, and finally a is a user-specified value, where a < stride - 1, that increases the padding at the top and right of the input image.
The relation between input to output is as follows:
\[ width\_output = (width\_input - 1) \cdot stride\_x - 2 \cdot padding\_x + kernel\_x \]
\[ height\_output = (height\_input - 1) \cdot stride\_y - 2 \cdot padding\_y + kernel\_y \]
where width_input and height_input are the sizes of the first and second input dimensions, width_output and height_output are the sizes of the first and second output dimensions, kernel_x and kernel_y are the convolution kernel sizes in x and y, and stride_x and stride_y are the input strides in the first and second dimensions.
The weights used by Deconvolution are supposed to be the same as the ones used for Convolution. Therefore, it will be necessary to use the weights in the reverse order to perform an actual convolution. This is achieved by using NEReverse.
This function calls the following kernels/functions:

CPPUpsample
NEReverse
NEConvolutionLayer
Definition at line 73 of file NEDeconvolutionLayer.h.
NEDeconvolutionLayer(std::shared_ptr<IMemoryManager> memory_manager = nullptr)
Constructor.
Definition at line 70 of file NEDeconvolutionLayer.cpp.
NEDeconvolutionLayer(const NEDeconvolutionLayer &) = delete
Prevent instances of this class from being copied (as this class contains pointers).
NEDeconvolutionLayer(NEDeconvolutionLayer &&) = default
Default move constructor.
~NEDeconvolutionLayer() = default
Default destructor.
void configure(ITensor             *input,
               const ITensor       *weights,
               const ITensor       *bias,
               ITensor             *output,
               const PadStrideInfo &info,
               bool                 enable_fast_math = false,
               const WeightsInfo   &weights_info     = WeightsInfo())
Set the input, weights, biases and output tensors.
Valid data layouts:
Valid data type configurations:
src0           | src1               | src2 | dst
---------------|--------------------|------|---------------
F16            | F16                | F16  | F16
F32            | F32                | F32  | F32
QASYMM8        | QASYMM8            | S32  | QASYMM8
QASYMM8        | QSYMM8_PER_CHANNEL | S32  | QASYMM8
QASYMM8_SIGNED | QASYMM8_SIGNED     | S32  | QASYMM8_SIGNED
QASYMM8_SIGNED | QSYMM8_PER_CHANNEL | S32  | QASYMM8_SIGNED
Parameters:
[in,out] input            Input tensor. 3 lower dimensions represent a single input, and an optional 4th dimension for batch of inputs. Data types supported: F32/F16/QASYMM8/QASYMM8_SIGNED.
[in]     weights          The 4d weights with dimensions [width, height, IFM, OFM]. Data type supported: same as input; can also be QSYMM8_PER_CHANNEL if input is QASYMM8/QASYMM8_SIGNED.
[in]     bias             Optional, ignored if NULL. The biases have one dimension. Data types supported: S32 for QASYMM8/QASYMM8_SIGNED input, F32 for F32 input, F16 for F16 input.
[out]    output           Output tensor. The output has the same number of dimensions as the input.
[in]     info             Contains padding and policies to be used in the deconvolution; this is described in PadStrideInfo.
[in]     enable_fast_math (Optional) Enable fast math computation. When this flag is set, the function may dispatch the fastest implementation available, which can reduce accuracy. Default is false.
[in]     weights_info     (Optional) Specifies the weight format. Default is unspecified. This parameter can be used to specify the weight format that is optimal for the GEMM convolution.
Definition at line 195 of file NEDeconvolutionLayer.cpp.
References TensorAllocator::allocate(), Tensor::allocator(), ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, ARM_COMPUTE_LOG_PARAMS, arm_compute::auto_init_if_empty(), bias, Tensor::buffer(), arm_compute::CEIL, ICloneable< T >::clone(), arm_compute::misc::shape_calculator::compute_deconvolution_output_shape(), arm_compute::misc::shape_calculator::compute_deconvolution_upsampled_shape(), CPPUpsample::configure(), NEReverse::configure(), NEConvolutionLayer::configure(), arm_compute::test::validation::conv_info, arm_compute::cpu::data_layout, arm_compute::deconvolution_output_dimensions(), ITensorInfo::dimension(), arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, arm_compute::cpu::height_idx, ITensor::info(), arm_compute::test::validation::info, TensorAllocator::init(), arm_compute::test::validation::input, MemoryGroup::manage(), arm_compute::test::validation::output_shape, PadStrideInfo::pad_bottom(), PadStrideInfo::pad_left(), PadStrideInfo::pad_right(), PadStrideInfo::pad_top(), TensorInfo::set_data_layout(), arm_compute::utils::cast::U, arm_compute::U32, NEDeconvolutionLayer::validate(), arm_compute::test::validation::weights_info, arm_compute::WIDTH, and arm_compute::cpu::width_idx.
NEDeconvolutionLayer &operator=(const NEDeconvolutionLayer &) = delete
Prevent instances of this class from being copied (as this class contains pointers).
NEDeconvolutionLayer &operator=(NEDeconvolutionLayer &&) = default
Default move assignment operator.
void prepare() override
Prepare the function for executing.
Any one-off pre-processing steps required by the function are handled here.
Reimplemented from IFunction.
Definition at line 294 of file NEDeconvolutionLayer.cpp.
References TensorAllocator::allocate(), Tensor::allocator(), ARM_COMPUTE_ERROR_ON, ITensor::is_used(), ITensor::mark_as_unused(), NEConvolutionLayer::prepare(), and INESimpleFunctionNoBorder::run().
Referenced by NEDeconvolutionLayer::run().
void run() override
Run the kernels contained in the function.
For CPU kernels:
For OpenCL kernels:
Implements IFunction.
Definition at line 281 of file NEDeconvolutionLayer.cpp.
References NEDeconvolutionLayer::prepare(), ICPPSimpleFunction::run(), and NEConvolutionLayer::run().
static Status validate(const ITensorInfo   *input,
                       const ITensorInfo   *weights,
                       const ITensorInfo   *bias,
                       const ITensorInfo   *output,
                       const PadStrideInfo &info,
                       bool                 enable_fast_math = false,
                       const WeightsInfo   &weights_info     = WeightsInfo())
Static function to check if given info will lead to a valid configuration of NEDeconvolutionLayer.
Parameters:
[in] input            Input tensor info. 3 lower dimensions represent a single input, and an optional 4th dimension for batch of inputs. Data types supported: F32/F16/QASYMM8/QASYMM8_SIGNED.
[in] weights          The 4d weights info with dimensions [width, height, IFM, OFM]. Data type supported: same as input; can also be QSYMM8_PER_CHANNEL if input is QASYMM8/QASYMM8_SIGNED.
[in] bias             (Optional) The biases have one dimension. Data types supported: S32 for QASYMM8/QASYMM8_SIGNED input, F32 for F32 input, F16 for F16 input.
[in] output           Output tensor info. The output has the same number of dimensions as the input.
[in] info             Contains padding and policies to be used in the deconvolution; this is described in PadStrideInfo.
[in] enable_fast_math (Optional) Enable fast math computation. When this flag is set, the function may dispatch the fastest implementation available, which can reduce accuracy. Default is false.
[in] weights_info     (Optional) Specifies the weight format. Default is unspecified. This parameter can be used to specify the weight format that is optimal for the GEMM convolution.
Definition at line 86 of file NEDeconvolutionLayer.cpp.
References ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_LAYOUT, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_MSG, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, arm_compute::BATCHES, bias, arm_compute::CEIL, arm_compute::CHANNEL, arm_compute::cpu::channel_idx, arm_compute::misc::shape_calculator::compute_deconvolution_output_shape(), arm_compute::misc::shape_calculator::compute_deconvolution_padding(), arm_compute::misc::shape_calculator::compute_deconvolution_upsampled_shape(), arm_compute::test::validation::conv_info, ITensorInfo::data_layout(), ITensorInfo::data_type(), arm_compute::deconvolution_output_dimensions(), ITensorInfo::dimension(), TensorInfo::dimension(), Window::DimX, Window::DimY, Window::DimZ, arm_compute::F16, arm_compute::F32, arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, arm_compute::cpu::height_idx, arm_compute::test::validation::info, arm_compute::test::validation::input, arm_compute::is_data_type_quantized(), arm_compute::is_data_type_quantized_asymmetric(), arm_compute::is_data_type_quantized_per_channel(), arm_compute::test::validation::output_shape, PadStrideInfo::pad_bottom(), PadStrideInfo::pad_left(), PadStrideInfo::pad_right(), PadStrideInfo::pad_top(), arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, arm_compute::QSYMM8_PER_CHANNEL, arm_compute::S32, ITensorInfo::tensor_shape(), TensorShape::total_size(), arm_compute::utils::cast::U, NEConvolutionLayer::validate(), arm_compute::test::validation::weights_info, arm_compute::WIDTH, arm_compute::cpu::width_idx, Dimensions< T >::x(), Dimensions< T >::y(), and Dimensions< T >::z().
Referenced by NEDeconvolutionLayer::configure().