Interface for the im2col reshape kernel.

#include <CpuIm2ColKernel.h>

Public Member Functions
	CpuIm2ColKernel ()=default
	Default constructor.

	ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE (CpuIm2ColKernel)

void	configure (const ITensorInfo *src, ITensorInfo *dst, const Size2D &kernel_dims, const PadStrideInfo &conv_info, bool has_bias, const Size2D &dilation=Size2D(1U, 1U), unsigned int num_groups=1, unsigned int input_pad_right=0)
	Set the input and output of the kernel.

void	run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info) override
	Execute the kernel on the passed window.

const char *	name () const override
	Name of the kernel.

size_t	get_mws (const CPUInfo &platform, size_t thread_count) const override
	Return minimum workload size of the relevant kernel.

Public Member Functions inherited from ICPPKernel
virtual	~ICPPKernel ()=default
	Default destructor.

virtual void	run (const Window &window, const ThreadInfo &info)
	Execute the kernel on the passed window.

virtual void	run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator)
	Legacy compatibility layer for implementations that do not support thread_locator. In these cases the interface is simply narrowed down to the legacy version.

Public Member Functions inherited from IKernel
	IKernel ()
	Constructor.

virtual	~IKernel ()=default
	Destructor.

virtual bool	is_parallelisable () const
	Indicates whether or not the kernel is parallelisable.

virtual BorderSize	border_size () const
	The size of the border for that kernel.

const Window &	window () const
	The maximum window the kernel can be executed on.

bool	is_window_configured () const
	Function to check if the embedded window of this kernel has been configured.

Static Public Member Functions
static Status	validate (const ITensorInfo *src, const ITensorInfo *dst, const Size2D &kernel_dims, const PadStrideInfo &conv_info, bool has_bias, const Size2D &dilation=Size2D(1U, 1U), unsigned int num_groups=1, unsigned int input_pad_right=0)
	Static function to check if given info will lead to a valid configuration.

Static Public Member Functions inherited from ICpuKernel< CpuIm2ColKernel >
static const auto *	get_implementation (const SelectorType &selector, KernelSelectionType selection_type=KernelSelectionType::Supported)
	Micro-kernel selector.

Additional Inherited Members
Static Public Attributes inherited from ICPPKernel
static constexpr size_t	default_mws = 1

Detailed Description

Interface for the im2col reshape kernel.

Rearranges image blocks into columns. Each convolution block is stripped out into a single column, which transforms the convolution into a plain matrix multiplication.

For example, given the image below and 3x3 image blocks with a stride of 1, we have:

\[ \left( \begin{array}{cccc} a00 & a01 & a02 & a03 \\ a10 & a11 & a12 & a13 \\ a20 & a21 & a22 & a23 \\ a30 & a31 & a32 & a33 \\ \end{array} \right) \rightarrow \left( \begin{array}{ccccccccc} a00 & a01 & a02 & a10 & a11 & a12 & a20 & a21 & a22 \\ a01 & a02 & a03 & a11 & a12 & a13 & a21 & a22 & a23 \\ a10 & a11 & a12 & a20 & a21 & a22 & a30 & a31 & a32 \\ a11 & a12 & a13 & a21 & a22 & a23 & a31 & a32 & a33 \\ \end{array} \right) \]
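The rearrangement above can be reproduced with a few nested loops. The following is a minimal sketch, assuming a single-channel, row-major image with no padding and no dilation; `im2col_sketch` is a hypothetical helper written for illustration and is not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a library function): single-channel im2col
// without padding or dilation. `image` is row-major, `height` rows of
// `width` elements. Each returned row holds one flattened kernel block.
std::vector<std::vector<float>> im2col_sketch(const std::vector<float> &image,
                                              std::size_t width, std::size_t height,
                                              std::size_t kernel_w, std::size_t kernel_h,
                                              std::size_t stride)
{
    assert(image.size() == width * height);
    assert(stride > 0 && kernel_w > 0 && kernel_h > 0);
    assert(kernel_w <= width && kernel_h <= height);

    // Number of block positions along each axis.
    const std::size_t out_w = (width - kernel_w) / stride + 1;
    const std::size_t out_h = (height - kernel_h) / stride + 1;

    std::vector<std::vector<float>> rows;
    rows.reserve(out_w * out_h);
    for (std::size_t oy = 0; oy < out_h; ++oy)
    {
        for (std::size_t ox = 0; ox < out_w; ++ox)
        {
            // Flatten the block whose top-left corner is (ox, oy) * stride.
            std::vector<float> row;
            row.reserve(kernel_w * kernel_h);
            for (std::size_t ky = 0; ky < kernel_h; ++ky)
            {
                for (std::size_t kx = 0; kx < kernel_w; ++kx)
                {
                    row.push_back(image[(oy * stride + ky) * width + (ox * stride + kx)]);
                }
            }
            rows.push_back(row);
        }
    }
    return rows;
}
```

For the 4x4 image above with 3x3 blocks and a stride of 1, this sketch yields 4 rows of 9 elements each, in the same order as the right-hand matrix.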

Definition at line 62 of file CpuIm2ColKernel.h.

Constructor & Destructor Documentation

◆ CpuIm2ColKernel()

CpuIm2ColKernel ( )

default

Default constructor.

Member Function Documentation

◆ ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE()

ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE ( CpuIm2ColKernel )

◆ configure()

void configure	(	const ITensorInfo *	src,
		ITensorInfo *	dst,
		const Size2D &	kernel_dims,
		const PadStrideInfo &	conv_info,
		bool	has_bias,
		const Size2D &	dilation = `Size2D(1U, 1U)`,
		unsigned int	num_groups = `1`,
		unsigned int	input_pad_right = `0`
	)

Set the input and output of the kernel.

Parameters

[in]	src	The input tensor info to convert. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/BFLOAT16/F16/F32. Note: QASYMM8/QASYMM8_SIGNED works only for has_bias = false.
[out]	dst	The output tensor info. Data types supported: Same as `src`.
[in]	kernel_dims	The kernel dimensions (width and height).
[in]	conv_info	Contains padding and stride information described in PadStrideInfo.
[in]	has_bias	If biases are provided, expands the matrix with 1.
[in]	dilation	(Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
[in]	num_groups	(Optional) Number of groups when performing a grouped convolution. num_groups != 1 is not supported.
[in]	input_pad_right	(Optional) When fast-math is selected, per-element padding for the im2col matrix may be necessary.
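To make the interaction between kernel_dims, conv_info, dilation and has_bias more concrete, the sketch below estimates the shape of the im2col matrix for a single, non-batched input. It assumes floor rounding and the standard convolution output-size formula; `ConvParamsSketch`, `convolved_extent` and `im2col_shape_sketch` are hypothetical names introduced here, and the effect of input_pad_right and num_groups is deliberately left out. The library's own scaled_dimensions() and compute_im2col_conv_shape() remain the authoritative reference.

```cpp
#include <cassert>
#include <utility>

// Hypothetical plain struct standing in for Size2D / PadStrideInfo / dilation.
struct ConvParamsSketch
{
    unsigned int kernel_w, kernel_h;
    unsigned int stride_x, stride_y;
    unsigned int pad_left, pad_right, pad_top, pad_bottom;
    unsigned int dilation_x, dilation_y;
};

// Output extent along one axis, assuming floor rounding.
unsigned int convolved_extent(unsigned int in, unsigned int pad_before, unsigned int pad_after,
                              unsigned int kernel, unsigned int stride, unsigned int dilation)
{
    assert(stride > 0 && kernel > 0 && dilation > 0);
    // A dilated kernel covers dilation * (kernel - 1) + 1 input elements.
    const unsigned int effective_kernel = dilation * (kernel - 1) + 1;
    assert(in + pad_before + pad_after >= effective_kernel);
    return (in + pad_before + pad_after - effective_kernel) / stride + 1;
}

// Returns {elements per flattened block, number of blocks}.
std::pair<unsigned int, unsigned int> im2col_shape_sketch(unsigned int width, unsigned int height,
                                                          unsigned int channels,
                                                          const ConvParamsSketch &p, bool has_bias)
{
    const unsigned int out_w =
        convolved_extent(width, p.pad_left, p.pad_right, p.kernel_w, p.stride_x, p.dilation_x);
    const unsigned int out_h =
        convolved_extent(height, p.pad_top, p.pad_bottom, p.kernel_h, p.stride_y, p.dilation_y);
    // With a bias, every flattened block is extended by one extra element (the 1).
    const unsigned int block_len = p.kernel_w * p.kernel_h * channels + (has_bias ? 1u : 0u);
    return {block_len, out_w * out_h};
}
```

Under these assumptions, a 4x4 single-channel input with a 3x3 kernel, stride 1 and no padding gives 4 blocks of 9 elements, or 10 elements per block when has_bias is true.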

Definition at line 285 of file CpuIm2ColKernel.cpp.

 {
     ARM_COMPUTE_ERROR_ON_NULLPTR(src, dst);
     ARM_COMPUTE_ERROR_THROW_ON(
         validate_arguments(src, dst, kernel_dims, conv_info, has_bias, dilation, num_groups, input_pad_right));
     ARM_COMPUTE_UNUSED(num_groups);
  
     _data_layout                   = src->data_layout();
     const unsigned int width_idx   = get_data_layout_dimension_index(_data_layout, DataLayoutDimension::WIDTH);
     const unsigned int height_idx  = get_data_layout_dimension_index(_data_layout, DataLayoutDimension::HEIGHT);
     const unsigned int channel_idx = get_data_layout_dimension_index(_data_layout, DataLayoutDimension::CHANNEL);
  
     _conv_info       = conv_info;
     _kernel_width    = kernel_dims.width;
     _kernel_height   = kernel_dims.height;
     _input_pad_right = input_pad_right;
     _dilation        = dilation;
     _convolved_dims  = scaled_dimensions(src->dimension(width_idx), dst->dimension(height_idx), _kernel_width,
                                          _kernel_height, _conv_info, _dilation);
     _has_bias        = has_bias;
  
     if (_data_layout == DataLayout::NCHW)
     {
         switch (src->data_type())
         {
             case DataType::F32:
                 _func = (!conv_info.has_padding()) ? &run_im2col_fp32_nchw_nopad : &run_im2col_fp32_nchw_pad;
                 break;
             case DataType::F16:
                 _func = (!conv_info.has_padding()) ? &internal_run_im2col_fp16_nchw_nopad
                                                    : &internal_run_im2col_fp16_nchw_pad;
                 break;
 #if defined(ARM_COMPUTE_ENABLE_BF16)
             case DataType::BFLOAT16:
                 _func = (!conv_info.has_padding()) ? &run_im2col_bf16_nchw_nopad : &run_im2col_bf16_nchw_pad;
                 break;
 #endif /* defined(ARM_COMPUTE_ENABLE_BF16) */
             case DataType::QASYMM8_SIGNED:
             case DataType::QASYMM8:
                 _func = (!conv_info.has_padding()) ? &run_im2col_qasymm8_nchw_nopad : &run_im2col_qasymm8_nchw_pad;
                 break;
             default:
                 ARM_COMPUTE_ERROR("Data type not supported");
                 break;
         }
     }
     else
     {
         switch (src->data_type())
         {
             case DataType::F32:
                 _func = (!conv_info.has_padding()) ? &run_im2col_fp32_nopad : &run_im2col_fp32_pad;
                 break;
             case DataType::F16:
                 _func = (!conv_info.has_padding()) ? &internal_run_im2col_fp16_nopad : &internal_run_im2col_fp16_pad;
                 break;
 #if defined(ARM_COMPUTE_ENABLE_BF16)
             case DataType::BFLOAT16:
                 _func = (!conv_info.has_padding()) ? &run_im2col_bf16_nopad : &run_im2col_bf16_pad;
                 break;
 #endif /* defined(ARM_COMPUTE_ENABLE_BF16) */
             case DataType::QASYMM8:
                 _func = (!conv_info.has_padding()) ? &run_im2col_uint8_nopad_nhwc : &run_im2col_qasymm8_pad_nhwc;
                 break;
             case DataType::QASYMM8_SIGNED:
                 _func = (!conv_info.has_padding()) ? &run_im2col_int8_nopad_nhwc : &run_im2col_qasymm8_pad_nhwc;
                 break;
             default:
                 ARM_COMPUTE_ERROR("Data type not supported");
                 break;
         }
     }
  
     // Output tensor auto initialization if not yet initialized
     auto_init_if_empty(
         *dst, src->clone()->set_tensor_shape(compute_im2col_conv_shape(src, kernel_dims, conv_info, has_bias, dilation,
                                                                        false, num_groups, _input_pad_right)));
  
     std::pair<unsigned int, unsigned int> convolved_dims =
         scaled_dimensions(src->dimension(width_idx), src->dimension(height_idx), kernel_dims.width, kernel_dims.height,
                           conv_info, dilation);
  
     Window win = calculate_max_window(*src, Steps());
     win.set(width_idx, Window::Dimension(0, convolved_dims.first, 1));
     win.set(height_idx, Window::Dimension(0, convolved_dims.second, 1));
     win.set(channel_idx, Window::Dimension(0, 1, 1));
     // Configure kernel window
     ICpuKernel::configure(win);
 }
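The selection logic above follows a common pattern: pick one function pointer at configure time, based on layout, data type and padding, and call it later without re-checking those conditions. The sketch below illustrates that pattern in isolation. All names (`LayoutSketch`, `ElemTypeSketch`, `select_im2col_sketch`, and the stub functions) are hypothetical, only F32 is handled, and the stubs return a label instead of doing any work.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for DataLayout and DataType.
enum class LayoutSketch { NCHW, NHWC };
enum class ElemTypeSketch { F32, F16 };

// Function-pointer type for the selected routine. The stubs only return a label.
using Im2ColFnSketch = std::string (*)();

std::string fp32_nchw_nopad() { return "fp32_nchw_nopad"; }
std::string fp32_nchw_pad() { return "fp32_nchw_pad"; }
std::string fp32_nhwc_nopad() { return "fp32_nhwc_nopad"; }
std::string fp32_nhwc_pad() { return "fp32_nhwc_pad"; }

// Choose a routine once, from layout, element type and padding.
// Returns nullptr for combinations this sketch does not cover.
Im2ColFnSketch select_im2col_sketch(LayoutSketch layout, ElemTypeSketch type, bool has_padding)
{
    if (type != ElemTypeSketch::F32)
    {
        return nullptr;
    }
    if (layout == LayoutSketch::NCHW)
    {
        return has_padding ? &fp32_nchw_pad : &fp32_nchw_nopad;
    }
    return has_padding ? &fp32_nhwc_pad : &fp32_nhwc_nopad;
}

// Call the selected routine, failing loudly if nothing was selected.
std::string run_selected_sketch(LayoutSketch layout, ElemTypeSketch type, bool has_padding)
{
    const Im2ColFnSketch fn = select_im2col_sketch(layout, type, has_padding);
    assert(fn != nullptr && "unsupported data type in this sketch");
    return fn();
}
```

Resolving the variant once keeps the per-window execution path free of layout and type branches, which appears to be the intent of storing `_func` in configure().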

◆ get_mws()

size_t get_mws	(	const CPUInfo &	platform,
		size_t	thread_count
	)		const

override virtual

Return minimum workload size of the relevant kernel.

Parameters

[in]	platform	The CPU platform used to create the context.
[in]	thread_count	Number of threads in the execution.

Returns: [out] small_network_mws Minimum workload size for the requested configuration.

Reimplemented from ICPPKernel.

Definition at line 414 of file CpuIm2ColKernel.cpp.

 {
     ARM_COMPUTE_UNUSED(thread_count);
     ARM_COMPUTE_UNUSED(platform);
  
     return ICPPKernel::default_mws;
 }

References ARM_COMPUTE_UNUSED, and ICPPKernel::default_mws.

◆ name()

const char * name ( ) const

override virtual

Name of the kernel.

Returns: Kernel name

Implements ICPPKernel.

Definition at line 409 of file CpuIm2ColKernel.cpp.

 {
     return "CpuIm2ColKernel";
 }

◆ run_op()

void run_op	(	ITensorPack &	tensors,
		const Window &	window,
		const ThreadInfo &	info
	)

override virtual

Execute the kernel on the passed window.

Warning: If is_parallelisable() returns false then the passed window must be equal to window()

Note: The window has to be a region within the window returned by the window() method. The width of the window has to be a multiple of num_elems_processed_per_iteration().

Parameters

[in]	tensors	A vector containing the tensors to operate on.
[in]	window	Region on which to execute the kernel. (Must be a region of the window returned by window())
[in]	info	Info about executing thread and CPU.

Reimplemented from ICPPKernel.

Definition at line 396 of file CpuIm2ColKernel.cpp.

 {
     ARM_COMPUTE_UNUSED(info);
     ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);
     ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW(ICpuKernel::window(), window);
  
     auto src = tensors.get_const_tensor(TensorType::ACL_SRC);
     auto dst = tensors.get_tensor(TensorType::ACL_DST);
  
     _func(src, dst, window, _data_layout, _conv_info, _convolved_dims, Size2D(_kernel_width, _kernel_height), _dilation,
           _input_pad_right, _has_bias);
 }

References arm_compute::ACL_DST, arm_compute::ACL_SRC, ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, arm_compute::test::validation::dst, ITensorPack::get_const_tensor(), ITensorPack::get_tensor(), arm_compute::test::validation::info, and arm_compute::test::validation::src.
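The requirement that the passed window be a region of window() can be illustrated with a one-dimensional analogue. The sketch below splits a full range into contiguous sub-ranges, as a scheduler might do per thread, and checks that each one lies inside the full range. `RangeSketch`, `is_subrange` and `split_range` are hypothetical names; the library's Window class and its sub-window validation are more general than this.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1-D stand-in for a Window dimension: half-open [start, end).
struct RangeSketch
{
    std::size_t start;
    std::size_t end;
};

// True if `sub` lies entirely inside `full`.
bool is_subrange(const RangeSketch &full, const RangeSketch &sub)
{
    return sub.start <= sub.end && sub.start >= full.start && sub.end <= full.end;
}

// Split `full` into at most `parts` contiguous, non-empty sub-ranges.
std::vector<RangeSketch> split_range(const RangeSketch &full, std::size_t parts)
{
    assert(parts > 0);
    assert(full.start <= full.end);

    const std::size_t total = full.end - full.start;
    const std::size_t base  = total / parts;
    const std::size_t extra = total % parts;

    std::vector<RangeSketch> out;
    std::size_t cursor = full.start;
    for (std::size_t i = 0; i < parts; ++i)
    {
        // The first `extra` sub-ranges take one additional element.
        const std::size_t len = base + (i < extra ? 1 : 0);
        if (len == 0)
        {
            continue; // fewer elements than parts: skip empty sub-ranges
        }
        out.push_back({cursor, cursor + len});
        cursor += len;
    }
    return out;
}
```

Each sub-range produced this way satisfies `is_subrange`, which is the one-dimensional counterpart of the check performed by ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW in the listing above.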

◆ validate()

Status validate	(	const ITensorInfo *	src,
		const ITensorInfo *	dst,
		const Size2D &	kernel_dims,
		const PadStrideInfo &	conv_info,
		bool	has_bias,
		const Size2D &	dilation = `Size2D(1U, 1U)`,
		unsigned int	num_groups = `1`,
		unsigned int	input_pad_right = `0`
	)

static

Static function to check if given info will lead to a valid configuration.

Similar to CpuIm2ColKernel::configure()

Returns: a status

Definition at line 382 of file CpuIm2ColKernel.cpp.

 {
     ARM_COMPUTE_RETURN_ON_ERROR(
         validate_arguments(src, dst, kernel_dims, conv_info, has_bias, dilation, num_groups, input_pad_right));
     return Status{};
 }

References ARM_COMPUTE_RETURN_ON_ERROR, arm_compute::test::validation::conv_info, arm_compute::test::validation::dst, arm_compute::test::validation::has_bias, arm_compute::test::validation::num_groups, arm_compute::test::validation::src, and arm_compute::cpu::kernels::validate_arguments().

Referenced by CpuGemmConv2d::validate().
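Two of the constraints stated in the configure() parameter list lend themselves to a short validate-then-configure sketch: quantized types work only with has_bias = false, and num_groups != 1 is not supported. The code below models only those two checks with hypothetical types (`ValidTypeSketch`, `StatusSketch`) and hypothetical functions (`validate_sketch`, `configure_sketch`). The real validate_arguments() performs further checks that are not shown on this page.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the supported data types listed for `src`.
enum class ValidTypeSketch { QASYMM8, QASYMM8_SIGNED, BFLOAT16, F16, F32 };

// Hypothetical stand-in for Status: ok flag plus an error description.
struct StatusSketch
{
    bool        ok;
    std::string message;
};

// Checks only the two constraints documented in the parameter list.
StatusSketch validate_sketch(ValidTypeSketch type, bool has_bias, unsigned int num_groups)
{
    const bool quantized =
        type == ValidTypeSketch::QASYMM8 || type == ValidTypeSketch::QASYMM8_SIGNED;
    if (quantized && has_bias)
    {
        return StatusSketch{false, "quantized types require has_bias = false"};
    }
    if (num_groups != 1)
    {
        return StatusSketch{false, "num_groups != 1 is not supported"};
    }
    return StatusSketch{true, ""};
}

// Mirrors the pattern of configure(): validate first, fail hard on error.
void configure_sketch(ValidTypeSketch type, bool has_bias, unsigned int num_groups)
{
    const StatusSketch status = validate_sketch(type, has_bias, num_groups);
    assert(status.ok && "configure_sketch called with invalid arguments");
    (void)status; // silence unused-variable warnings when NDEBUG is defined
}
```

Calling the static check before configuring lets a caller reject an unsupported combination without triggering the hard error path.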

The documentation for this class was generated from the following files:

src/cpu/kernels/CpuIm2ColKernel.h
src/cpu/kernels/CpuIm2ColKernel.cpp
