Compute Library
 19.08
NEWeightsReshapeKernel Class Reference

NEON kernel to perform reshaping on the weights used by convolution and locally connected layer. More...

#include <NEWeightsReshapeKernel.h>


Public Member Functions

const char * name () const override
 Name of the kernel. More...
 
 NEWeightsReshapeKernel ()
 Constructor. More...
 
 NEWeightsReshapeKernel (const NEWeightsReshapeKernel &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
NEWeightsReshapeKernel & operator= (const NEWeightsReshapeKernel &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 NEWeightsReshapeKernel (NEWeightsReshapeKernel &&)=default
 Allow instances of this class to be moved. More...
 
NEWeightsReshapeKernel & operator= (NEWeightsReshapeKernel &&)=default
 Allow instances of this class to be moved. More...
 
 ~NEWeightsReshapeKernel ()=default
 Default destructor. More...
 
void configure (const ITensor *input, const ITensor *bias, ITensor *output)
 Set the input and output of the kernel. More...
 
void run (const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window. More...
 
Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor. More...
 
Public Member Functions inherited from IKernel
 IKernel ()
 Constructor. More...
 
virtual ~IKernel ()=default
 Destructor. More...
 
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable. More...
 
virtual BorderSize border_size () const
 The size of the border for that kernel. More...
 
const Window & window () const
 The maximum window the kernel can be executed on. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *biases, const ITensorInfo *output)
 Static function to check if given info will lead to a valid configuration of NEWeightsReshapeKernel. More...
 

Detailed Description

NEON kernel to perform reshaping on the weights used by convolution and locally connected layers.

Rearranges each 3-dimensional kernel into a single row, producing a matrix of linearized kernels. In combination with NEIm2ColKernel, this transforms a convolution into a matrix multiplication.

For example, a 3D weight kernel of 3x3 dimensions and a depth of 2 is rearranged as follows:

\[ \left( \begin{array}{ccc} a000 & a001 & a002 \\ a010 & a011 & a012 \\ a020 & a021 & a022 \\ \end{array} \right) \left( \begin{array}{ccc} a100 & a101 & a102 \\ a110 & a111 & a112 \\ a120 & a121 & a122 \\ \end{array} \right) \rightarrow \left( \begin{array}{cccccccccccccccccc} a000 & a001 & a002 & a010 & a011 & a012 & a020 & a021 & a022 & a100 & a101 & a102 & a110 & a111 & a112 & a120 & a121 & a122 \\ \end{array} \right) \]
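
The linearization order in the example (x fastest, then y, then depth) can be illustrated with a small standalone sketch. It does not use Compute Library types; `linearize_kernel` and the nested `std::vector` layout `volume[d][y][x]` are assumptions made only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Compute Library.
// Flattens one weight volume stored as volume[d][y][x] into a single row,
// visiting x fastest, then y, then depth, as in the example above.
std::vector<float> linearize_kernel(const std::vector<std::vector<std::vector<float>>> &volume)
{
    assert(!volume.empty()); // a kernel needs at least one depth slice

    std::vector<float> row;
    for(const auto &plane : volume)   // depth (IFM)
    {
        for(const auto &line : plane) // kernel rows (y)
        {
            for(float value : line)   // kernel columns (x)
            {
                row.push_back(value);
            }
        }
    }
    return row;
}
```

For a 3x3 kernel of depth 2 this yields 18 values: all nine values of the first depth slice, followed by all nine of the second.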

Definition at line 56 of file NEWeightsReshapeKernel.h.

Constructor & Destructor Documentation

◆ NEWeightsReshapeKernel() [1/3]

Constructor.

Definition at line 90 of file NEWeightsReshapeKernel.cpp.

NEWeightsReshapeKernel::NEWeightsReshapeKernel()
    : _input(nullptr), _bias(nullptr), _output(nullptr)
{
}

◆ NEWeightsReshapeKernel() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ NEWeightsReshapeKernel() [3/3]

Allow instances of this class to be moved.

◆ ~NEWeightsReshapeKernel()

~NEWeightsReshapeKernel ( )
default

Default destructor.

Member Function Documentation

◆ configure()

void configure ( const ITensor * input,
const ITensor * bias,
ITensor * output
)

Set the input and output of the kernel.

Parameters
    [in]  input   The input tensor to convert. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM] if shared, and a 5D tensor with dimensions [kernel_x, kernel_y, IFM, OFM, num_patches] if unshared. Data types supported: QASYMM8/F32
    [in]  bias    The shared biases tensor to append. Bias is a 1D tensor with dimensions [OFM] if shared and a 2D tensor with dimensions [OFM, num_patches] if unshared. Data types supported: Same as input
    [out] output  The output tensor. Data types supported: Same as input
Warning
    Appending biases to the reshaped weights matrix is not supported for quantized asymmetric types.

Definition at line 95 of file NEWeightsReshapeKernel.cpp.

void NEWeightsReshapeKernel::configure(const ITensor *input, const ITensor *bias, ITensor *output)
{
    ARM_COMPUTE_ERROR_ON_NULLPTR(input, output);

    // Output tensor auto initialization if not yet initialized
    auto_init_if_empty(*output->info(), input->info()->clone()->set_tensor_shape(get_output_shape(input->info(), (bias != nullptr))));

    // Perform validation step
    ARM_COMPUTE_ERROR_THROW_ON(validate_arguments(input->info(),
                                                  (bias != nullptr) ? bias->info() : nullptr,
                                                  output->info()));

    _input  = input;
    _bias   = bias;
    _output = output;

    // Configure kernel
    auto win_config = validate_and_configure_window(input->info(), output->info());
    ARM_COMPUTE_ERROR_THROW_ON(win_config.first);
    INEKernel::configure(win_config.second);
}

References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, arm_compute::auto_init_if_empty(), arm_compute::test::validation::bias, ICloneable< T >::clone(), ITensor::info(), CLTensor::info(), and arm_compute::validate_and_configure_window().

Referenced by NEConvolutionLayerReshapeWeights::configure(), and NELocallyConnectedLayer::configure().
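
When the output tensor is empty, configure() initializes its shape from the input. A rough sketch of the 2D shape for shared weights follows. It is inferred from the run() listing, which writes one kernel per x coordinate and one linearized element per y step. `ReshapedShape` and `reshaped_weights_shape` are hypothetical names, not the library's get_output_shape().

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the shape of the reshaped weights matrix.
struct ReshapedShape
{
    std::size_t width;  // x dimension: one column per output feature map (OFM)
    std::size_t height; // y dimension: one row per linearized weight element
};

// Assumes shared weights with dimensions [kernel_x, kernel_y, IFM, OFM].
ReshapedShape reshaped_weights_shape(std::size_t kernel_x, std::size_t kernel_y,
                                     std::size_t ifm, std::size_t ofm, bool has_bias)
{
    assert(kernel_x > 0 && kernel_y > 0 && ifm > 0 && ofm > 0);

    // The three kernel dimensions collapse into one; an appended bias adds one row.
    const std::size_t rows = kernel_x * kernel_y * ifm + (has_bias ? 1u : 0u);
    return ReshapedShape{ ofm, rows };
}
```

Under this assumption, a 3x3 kernel with IFM = 2 and OFM = 8 gives an 8-wide matrix with 18 rows, or 19 rows when a bias is appended. For unshared weights, num_patches would presumably remain as a further dimension.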

◆ name()

const char* name ( ) const
inline override virtual

Name of the kernel.

Returns
Kernel name

Implements ICPPKernel.

Definition at line 59 of file NEWeightsReshapeKernel.h.

const char *name() const override
{
    return "NEWeightsReshapeKernel";
}

◆ operator=() [1/2]

NEWeightsReshapeKernel& operator= ( const NEWeightsReshapeKernel & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

NEWeightsReshapeKernel& operator= ( NEWeightsReshapeKernel &&  )
default

Allow instances of this class to be moved.

◆ run()

void run ( const Window & window,
const ThreadInfo & info
)
override virtual

Execute the kernel on the passed window.

Warning
    If is_parallelisable() returns false, the passed window must be equal to window().
Note
    The window has to be a region within the window returned by the window() method.
    The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
    [in]  window  Region on which to execute the kernel. (Must be a region of the window returned by window())
    [in]  info    Info about executing thread and CPU.

Implements ICPPKernel.

Definition at line 125 of file NEWeightsReshapeKernel.cpp.

void NEWeightsReshapeKernel::run(const Window &window, const ThreadInfo &info)
{
    ARM_COMPUTE_UNUSED(info);
    ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);
    ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW(INEKernel::window(), window);

    const unsigned int kernel_size_x   = _input->info()->dimension(0);
    const unsigned int kernel_size_y   = _input->info()->dimension(1);
    const unsigned int kernel_depth    = _input->info()->dimension(2);
    const unsigned int input_stride_x  = _input->info()->strides_in_bytes().x();
    const unsigned int input_stride_y  = _input->info()->strides_in_bytes().y();
    const unsigned int input_stride_z  = _input->info()->strides_in_bytes().z();
    const unsigned int output_stride_y = _output->info()->strides_in_bytes().y();

    // Create iterators
    Iterator in(_input, window);
    execute_window_loop(window, [&](const Coordinates & id)
    {
        // Get column index
        const int kernel_idx = id[3];
        const int kernel_idz = id[4];

        // Setup pointers
        const uint8_t *tmp_input_ptr        = in.ptr();
        uint8_t       *tmp_output_ptr       = _output->ptr_to_element(Coordinates(kernel_idx, 0, kernel_idz));
        const uint8_t *curr_input_row_ptr   = tmp_input_ptr;
        const uint8_t *curr_input_depth_ptr = tmp_input_ptr;

        // Linearize volume
        for(unsigned int d = 0; d < kernel_depth; ++d)
        {
            for(unsigned int j = 0; j < kernel_size_y; ++j)
            {
                for(unsigned int i = 0; i < kernel_size_x; ++i)
                {
                    std::memcpy(tmp_output_ptr, tmp_input_ptr, _input->info()->element_size());
                    tmp_input_ptr += input_stride_x;
                    tmp_output_ptr += output_stride_y;
                }
                curr_input_row_ptr += input_stride_y;
                tmp_input_ptr = curr_input_row_ptr;
            }
            curr_input_depth_ptr += input_stride_z;
            curr_input_row_ptr = curr_input_depth_ptr;
            tmp_input_ptr      = curr_input_depth_ptr;
        }

        // Add bias
        if(_bias != nullptr)
        {
            std::memcpy(tmp_output_ptr, _bias->ptr_to_element(Coordinates(kernel_idx, kernel_idz)), _input->info()->element_size());
        }
    },
    in);
}

References ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, ITensorInfo::dimension(), ITensorInfo::element_size(), arm_compute::execute_window_loop(), ITensor::info(), arm_compute::test::validation::info, Iterator::ptr(), ITensor::ptr_to_element(), ITensorInfo::strides_in_bytes(), IKernel::window(), Dimensions< T >::x(), Dimensions< T >::y(), and Dimensions< T >::z().
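
The copy pattern in run() can be approximated without the library's tensors, iterators, or byte strides. The sketch below assumes a densely packed float array for shared weights (x fastest, then y, IFM, OFM) and a dense output matrix with one kernel per column. `reshape_weights` and its parameters are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch only, not the library kernel.
// weights: dense [kernel_x, kernel_y, IFM, OFM] data, so each kernel occupies
//          kernel_elems = kernel_x * kernel_y * IFM consecutive values.
// bias:    optional pointer to OFM values; nullptr means no bias is appended.
// Returns a dense matrix with `ofm` columns (x fastest): column k holds kernel k,
// followed by bias[k] in the last row when a bias is given.
std::vector<float> reshape_weights(const std::vector<float> &weights, std::size_t kernel_elems,
                                   std::size_t ofm, const std::vector<float> *bias)
{
    assert(weights.size() == kernel_elems * ofm);
    assert(bias == nullptr || bias->size() == ofm);

    const std::size_t  rows = kernel_elems + (bias != nullptr ? 1 : 0);
    std::vector<float> out(rows * ofm, 0.f);

    for(std::size_t k = 0; k < ofm; ++k)
    {
        // Linearize the volume: step one output row per copied element
        for(std::size_t e = 0; e < kernel_elems; ++e)
        {
            out[e * ofm + k] = weights[k * kernel_elems + e];
        }
        // Add bias after the last weight of this kernel
        if(bias != nullptr)
        {
            out[kernel_elems * ofm + k] = (*bias)[k];
        }
    }
    return out;
}
```

Because the input is assumed dense here, the three nested loops of the listing collapse into a single loop over `kernel_elems`. The real kernel keeps them separate so that padded strides are handled correctly.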

◆ validate()

Status validate ( const ITensorInfo * input,
const ITensorInfo * biases,
const ITensorInfo * output
)
static

Static function to check if given info will lead to a valid configuration of NEWeightsReshapeKernel.

Parameters
    [in]  input   The input tensor to convert. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM] if shared, and a 5D tensor with dimensions [kernel_x, kernel_y, IFM, OFM, num_patches] if unshared. Data types supported: QASYMM8/F16/F32
    [in]  biases  The shared biases tensor to append. Bias is a 1D tensor with dimensions [OFM] if shared and a 2D tensor with dimensions [OFM, num_patches] if unshared. Data types supported: Same as input
    [in]  output  The output tensor. Should be a 2D tensor. Data types supported: Same as input
Warning
    Appending biases to the reshaped weights matrix is not supported for quantized asymmetric types.
Returns
    A status

Definition at line 117 of file NEWeightsReshapeKernel.cpp.

Status NEWeightsReshapeKernel::validate(const ITensorInfo *input, const ITensorInfo *biases, const ITensorInfo *output)
{
    ARM_COMPUTE_RETURN_ON_ERROR(validate_arguments(input, biases, output));
    ARM_COMPUTE_RETURN_ON_ERROR(validate_and_configure_window(input->clone().get(), output->clone().get()).first);

    return Status{};
}

References ARM_COMPUTE_RETURN_ON_ERROR, ICloneable< T >::clone(), and arm_compute::validate_and_configure_window().

Referenced by NEConvolutionLayerReshapeWeights::validate(), and NELocallyConnectedLayer::validate().
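
The constraints listed in the parameter documentation can be summarised as a standalone check. This is only a sketch of the documented rules, not the library's validate_arguments(). `ElemType`, `TensorDesc` and `check_reshape_args` are hypothetical names, and the output tensor checks are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical types for illustration; they are not Compute Library types.
enum class ElemType { QASYMM8, F16, F32 };

struct TensorDesc
{
    ElemType                 type;
    std::vector<std::size_t> shape; // weights: [kernel_x, kernel_y, IFM, OFM(, num_patches)]
};

// Returns an empty string when the documented constraints hold,
// otherwise a short description of the first violated constraint.
std::string check_reshape_args(const TensorDesc &input, const TensorDesc *biases)
{
    const std::size_t dims = input.shape.size();
    if(dims != 4 && dims != 5)
    {
        return "weights must be 4D (shared) or 5D (unshared)";
    }
    if(biases == nullptr)
    {
        return ""; // biases are optional
    }
    if(input.type == ElemType::QASYMM8)
    {
        return "biases cannot be appended for quantized asymmetric types";
    }
    if(biases->type != input.type)
    {
        return "biases must have the same data type as the weights";
    }
    // 1D [OFM] for shared weights, 2D [OFM, num_patches] for unshared weights
    if(biases->shape.size() != dims - 3)
    {
        return "unexpected bias rank";
    }
    if(biases->shape[0] != input.shape[3])
    {
        return "bias length must equal OFM";
    }
    if(dims == 5 && biases->shape[1] != input.shape[4])
    {
        return "bias num_patches must match the weights";
    }
    return "";
}
```

In the library itself, the usual pattern is to call the static validate() with tensor infos before calling configure(), so that an unsupported combination is reported as a Status rather than raised as an error during configuration.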


The documentation for this class was generated from the following files: