Compute Library
 21.02
NEDirectConvolutionLayerKernel Class Reference

Neon interface for Direct Convolution Layer kernel.

#include <NEDirectConvolutionLayerKernel.h>

Public Member Functions

const char * name () const override
 Name of the kernel.
 
 NEDirectConvolutionLayerKernel ()
 Default constructor.
 
 NEDirectConvolutionLayerKernel (const NEDirectConvolutionLayerKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 
NEDirectConvolutionLayerKernel & operator= (const NEDirectConvolutionLayerKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 
 NEDirectConvolutionLayerKernel (NEDirectConvolutionLayerKernel &&)=default
 Allow instances of this class to be moved.
 
NEDirectConvolutionLayerKernel & operator= (NEDirectConvolutionLayerKernel &&)=default
 Allow instances of this class to be moved.
 
 ~NEDirectConvolutionLayerKernel ()=default
 Default destructor.
 
void configure (const ITensor *input, const ITensor *weights, ITensor *output, const PadStrideInfo &conv_info)
 Set the input, weights, and output tensors.
 
void run (const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window.
 
BorderSize border_size () const override
 The size of the border for that kernel.
 
- Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor.
 
virtual void run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator)
 Legacy compatibility layer for implementations which do not support thread_locator. In these cases we simply narrow the interface down to the legacy version.
 
virtual void run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info)
 Execute the kernel on the passed window.
 
- Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.
 
virtual ~IKernel ()=default
 Destructor.
 
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.
 
const Window & window () const
 The maximum window the kernel can be executed on.
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *output, const PadStrideInfo &conv_info)
 Static function to check if given info will lead to a valid configuration of NEDirectConvolutionLayerKernel.
 

Detailed Description

Neon interface for Direct Convolution Layer kernel.

Definition at line 34 of file NEDirectConvolutionLayerKernel.h.

Constructor & Destructor Documentation

◆ NEDirectConvolutionLayerKernel() [1/3]

Default constructor.

Definition at line 1221 of file NEDirectConvolutionLayerKernel.cpp.

Referenced by NEDirectConvolutionLayerKernel::name().

    : _input(nullptr), _weights(nullptr), _output(nullptr), _conv_info(), _border_size(0), _kernel_size(0), _num_weight_elems_read_per_row(0), _num_elems_read_per_iteration(0),
      _num_elems_written_per_iteration(0)
{
}

◆ NEDirectConvolutionLayerKernel() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ NEDirectConvolutionLayerKernel() [3/3]

Allow instances of this class to be moved.
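
The copy and move rules above can be illustrated with a small stand-alone class. This is only a sketch: `MoveOnlyKernel` and its `set_input()`/`input()` members are hypothetical stand-ins, not part of Compute Library. It shows a type that stores a raw, non-owning pointer, deletes its copy operations and defaults its move operations.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a kernel that keeps non-owning tensor pointers.
class MoveOnlyKernel
{
public:
    MoveOnlyKernel() = default;
    // Copying is disabled because the object holds raw pointers.
    MoveOnlyKernel(const MoveOnlyKernel &) = delete;
    MoveOnlyKernel &operator=(const MoveOnlyKernel &) = delete;
    // Moving is allowed; the defaulted move simply transfers the pointer value.
    MoveOnlyKernel(MoveOnlyKernel &&) = default;
    MoveOnlyKernel &operator=(MoveOnlyKernel &&) = default;
    ~MoveOnlyKernel() = default;

    void set_input(const float *input) { _input = input; }
    const float *input() const { return _input; }

private:
    const float *_input{ nullptr };
};

// Compile-time checks of the copy/move contract.
static_assert(!std::is_copy_constructible<MoveOnlyKernel>::value, "copy construction must be disabled");
static_assert(!std::is_copy_assignable<MoveOnlyKernel>::value, "copy assignment must be disabled");
static_assert(std::is_move_constructible<MoveOnlyKernel>::value, "move construction must be enabled");
static_assert(std::is_move_assignable<MoveOnlyKernel>::value, "move assignment must be enabled");
```

With this pattern, `MoveOnlyKernel b(std::move(a));` compiles, while `MoveOnlyKernel b(a);` is rejected at compile time.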

◆ ~NEDirectConvolutionLayerKernel()

Default destructor.

Referenced by NEDirectConvolutionLayerKernel::name().

Member Function Documentation

◆ border_size()

BorderSize border_size ( ) const
override virtual

The size of the border for that kernel.

Returns
The width in number of elements of the border.

Reimplemented from IKernel.

Definition at line 1227 of file NEDirectConvolutionLayerKernel.cpp.

Referenced by NEDirectConvolutionLayerKernel::name(), and NEDirectConvolutionLayerKernel::validate().

{
    return _border_size;
}

◆ configure()

void configure ( const ITensor * input,
const ITensor * weights,
ITensor * output,
const PadStrideInfo & conv_info 
)

Set the input, weights, and output tensors.

Note
DirectConvolution only works in the following configurations:
  1x1 convolution with stride_x = 1/2/3, stride_y = 1/2/3
  3x3 convolution with stride_x = 1/2/3, stride_y = 1/2/3
Parameters
[in]  input  The input tensor to convolve. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: F16/F32.
[in]  weights  Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. The 3rd dimension must be the same as the input's 3rd dimension. Data type supported: Same as input.
[out]  output  Output tensor. The 3rd dimension must be equal to the 4th dimension of the weights tensor. Data types supported: F16/F32.
[in]  conv_info  Contains padding and stride information described in PadStrideInfo.
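
As a rough illustration of the note above, the documented kernel-size and stride combinations can be written as a predicate. This is a sketch only: `is_documented_config` is a hypothetical helper that mirrors the note, not the library's own validation. The run() listing on this page additionally dispatches a 5x5 case for F32, which the note does not mention, so it is left out here.

```cpp
// Hypothetical helper: encodes only the configurations listed in the note
// (1x1 or 3x3 kernels, strides 1, 2 or 3 in each direction).
bool is_documented_config(unsigned int kernel_size, unsigned int stride_x, unsigned int stride_y)
{
    const bool kernel_ok   = (kernel_size == 1) || (kernel_size == 3);
    const bool stride_x_ok = (stride_x >= 1) && (stride_x <= 3);
    const bool stride_y_ok = (stride_y >= 1) && (stride_y <= 3);
    return kernel_ok && stride_x_ok && stride_y_ok;
}
```

For example, `is_documented_config(3, 2, 1)` is true, while a stride of 4 or a 7x7 kernel is rejected.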

Definition at line 1232 of file NEDirectConvolutionLayerKernel.cpp.

References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, arm_compute::auto_init_if_empty(), arm_compute::misc::shape_calculator::compute_deep_convolution_shape(), arm_compute::test::validation::conv_info, ITensorInfo::data_layout(), arm_compute::test::validation::data_type, ITensorInfo::data_type(), ITensorInfo::dimension(), arm_compute::get_data_layout_dimension_index(), ITensor::info(), arm_compute::test::validation::input, arm_compute::NCHW, arm_compute::test::validation::output_shape, PadStrideInfo::pad_bottom(), PadStrideInfo::pad_left(), PadStrideInfo::pad_right(), PadStrideInfo::pad_top(), arm_compute::validate_arguments(), and arm_compute::WIDTH.

Referenced by NEDirectConvolutionLayerKernel::name().

{
    ARM_COMPUTE_ERROR_ON_NULLPTR(input, weights, output);

    _input = input;
    _weights = weights;
    _output = output;
    _conv_info = conv_info;
    _kernel_size = weights->info()->dimension(get_data_layout_dimension_index(weights->info()->data_layout(), DataLayoutDimension::WIDTH));

    const unsigned int conv_pad_left = conv_info.pad_left();
    const unsigned int conv_pad_top = conv_info.pad_top();
    const unsigned int conv_pad_right = conv_info.pad_right();
    const unsigned int conv_pad_bottom = conv_info.pad_bottom();
    if(_input->info()->data_layout() == DataLayout::NCHW)
    {
        _border_size = BorderSize(conv_pad_top, conv_pad_right, conv_pad_bottom, conv_pad_left);
    }
    else
    {
        _border_size = BorderSize(0);
    }

    // Get convolved dimensions
    TensorShape output_shape = misc::shape_calculator::compute_deep_convolution_shape(*input->info(), *weights->info(), conv_info);

    DataType data_type = input->info()->data_type();

    // Output auto initialization if not yet initialized
    auto_init_if_empty(*output->info(), output_shape, 1, data_type);

    // Perform validation step
    ARM_COMPUTE_ERROR_THROW_ON(validate_arguments(input->info(), weights->info(), output->info(), conv_info));

    // Configure kernel window
    auto win_config = validate_and_configure_window(input->info(), weights->info(), output->info(), conv_info, _num_weight_elems_read_per_row,
                                                    _num_elems_read_per_iteration, _num_elems_written_per_iteration, _border_size);
    ARM_COMPUTE_ERROR_THROW_ON(win_config.first);
    INEKernel::configure(win_config.second);
}
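
Two steps of the listing above can be sketched without the library: choosing the border from the data layout, and computing a convolved dimension. The types and functions below (`Layout`, `Border`, `select_border`, `conv_output_dim`) are hypothetical. The size formula assumes floor rounding, no dilation, and a padded input at least as large as the kernel; the exact rounding used by compute_deep_convolution_shape() is not shown on this page.

```cpp
// Hypothetical stand-ins for DataLayout and BorderSize.
enum class Layout { NCHW, NHWC };

struct Border
{
    unsigned int top;
    unsigned int right;
    unsigned int bottom;
    unsigned int left;
};

// Mirrors the branch in configure(): NCHW uses the padding as border,
// any other layout uses a zero border.
Border select_border(Layout layout, unsigned int pad_top, unsigned int pad_right,
                     unsigned int pad_bottom, unsigned int pad_left)
{
    if(layout == Layout::NCHW)
    {
        return Border{ pad_top, pad_right, pad_bottom, pad_left };
    }
    return Border{ 0, 0, 0, 0 };
}

// Output extent along one axis, assuming floor rounding and no dilation.
unsigned int conv_output_dim(unsigned int in, unsigned int kernel, unsigned int pad_before,
                             unsigned int pad_after, unsigned int stride)
{
    return (in + pad_before + pad_after - kernel) / stride + 1;
}
```

Under these assumptions, an 8-element axis with a 3-wide kernel, padding 1 on each side and stride 1 keeps its size of 8, while the same axis with no padding and stride 2 shrinks to 3.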

◆ name()

Name of the kernel.

◆ operator=() [1/2]

Prevent instances of this class from being copied (As this class contains pointers)

Referenced by NEDirectConvolutionLayerKernel::name().

◆ operator=() [2/2]

Allow instances of this class to be moved.

◆ run()

void run ( const Window & window,
const ThreadInfo & info 
)
override virtual

Execute the kernel on the passed window.

Warning
If is_parallelisable() returns false then the passed window must be equal to window().
Note
The window has to be a region within the window returned by the window() method.
The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
[in]  window  Region on which to execute the kernel. (Must be a region of the window returned by window())
[in]  info  Info about executing thread and CPU.
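
The two window constraints in the note can be pictured along a single dimension. The sketch below is an assumption-laden simplification: `Range` and `is_valid_subrange` are hypothetical and reduce the library's multi-dimensional Window to one half-open interval.

```cpp
// Hypothetical 1D window: the half-open interval [start, end).
struct Range
{
    int start;
    int end;
};

// True if `sub` lies inside `full` and its width is a multiple of the
// number of elements processed per iteration.
bool is_valid_subrange(const Range &full, const Range &sub, int elems_per_iteration)
{
    const bool inside  = sub.start >= full.start && sub.end <= full.end && sub.start <= sub.end;
    const bool aligned = elems_per_iteration > 0 && (sub.end - sub.start) % elems_per_iteration == 0;
    return inside && aligned;
}
```

With a full range of [0, 16) and 4 elements per iteration, [4, 12) is accepted, whereas [4, 18) leaves the full range and [0, 6) has a width that is not a multiple of 4.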

Reimplemented from ICPPKernel.

Definition at line 1293 of file NEDirectConvolutionLayerKernel.cpp.

References ARM_COMPUTE_ERROR, ARM_COMPUTE_ERROR_ON, ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, ITensor::buffer(), ITensorInfo::data_layout(), ITensorInfo::data_type(), ITensorInfo::dimension(), arm_compute::F16, arm_compute::F32, arm_compute::get_data_layout_dimension_index(), ITensor::info(), arm_compute::NCHW, arm_compute::WIDTH, and IKernel::window().

Referenced by NEDirectConvolutionLayerKernel::name().

{
    // (lines 1295-1297 of NEDirectConvolutionLayerKernel.cpp are not shown in this listing)
    ARM_COMPUTE_ERROR_ON(_input->buffer() == nullptr);

    const int kernel_size = _weights->info()->dimension(get_data_layout_dimension_index(_weights->info()->data_layout(), DataLayoutDimension::WIDTH));

    if(_input->info()->data_layout() == DataLayout::NCHW)
    {
        switch(kernel_size)
        {
            case 1:
            {
                switch(_input->info()->data_type())
                {
                    case DataType::F32:
                        convolve_1x1<float, float>(window, _num_elems_read_per_iteration, _num_elems_written_per_iteration, _input, _weights, _output, _conv_info);
                        break;
#ifdef __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
                    case DataType::F16:
                        convolve_1x1<float16_t, float16_t>(window, _num_elems_read_per_iteration, _num_elems_written_per_iteration, _input, _weights, _output, _conv_info);
                        break;
#endif /* __ARM_FEATURE_FP16_VECTOR_ARITHMETIC */
                    default:
                        ARM_COMPUTE_ERROR("Data type not supported");
                        break;
                }
                break;
            }
            case 3:
            {
                switch(_input->info()->data_type())
                {
                    case DataType::F32:
                        convolve_3x3<float, float>(window, _num_elems_read_per_iteration, _num_elems_written_per_iteration, _input, _weights, _output, _conv_info);
                        break;
#ifdef __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
                    case DataType::F16:
                        convolve_3x3<float16_t, float16_t>(window, _num_elems_read_per_iteration, _num_elems_written_per_iteration, _input, _weights, _output, _conv_info);
                        break;
#endif /* __ARM_FEATURE_FP16_VECTOR_ARITHMETIC */
                    default:
                        ARM_COMPUTE_ERROR("Data type not supported");
                        break;
                }
                break;
            }
            case 5:
            {
                switch(_input->info()->data_type())
                {
                    case DataType::F32:
                        convolve_5x5<float, float>(window, _num_elems_read_per_iteration, _num_elems_written_per_iteration, _input, _weights, _output, _conv_info);
                        break;
                    default:
                        ARM_COMPUTE_ERROR("Data type not supported");
                        break;
                }
                break;
            }
            default:
            {
                ARM_COMPUTE_ERROR("Only kernel sizes 1x1, 3x3 and 5x5 are supported.");
                break;
            }
        }
    }
    else
    {
        switch(_input->info()->data_type())
        {
            case DataType::F32:
            {
                if(have_zero_x_internal_padding(_input->info(), _weights->info()))
                {
                    convolve_nhwc_optimized<float>(window);
                }
                else
                {
                    convolve_nhwc<float>(window);
                }
                break;
            }
            default:
                ARM_COMPUTE_ERROR("Data type not supported");
                break;
        }
    }
}
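
The convolve_* helpers dispatched above are not shown on this page. As a point of reference only, a direct convolution can be written as plain nested loops over the output, with no intermediate im2col buffer. The sketch below is a naive, single-channel F32 version with zero padding and a square kernel; `direct_conv2d` and its parameters are hypothetical, and it makes no attempt to reproduce the vectorised Neon implementation. Like most deep-learning convolutions, it does not flip the kernel.

```cpp
#include <cstddef>
#include <vector>

// Naive direct convolution of one row-major width x height plane with a
// ksize x ksize kernel, using zero padding `pad` and stride `stride`.
std::vector<float> direct_conv2d(const std::vector<float> &input, std::size_t width, std::size_t height,
                                 const std::vector<float> &kernel, std::size_t ksize,
                                 std::size_t stride, std::size_t pad)
{
    const std::size_t out_w = (width + 2 * pad - ksize) / stride + 1;
    const std::size_t out_h = (height + 2 * pad - ksize) / stride + 1;
    std::vector<float> output(out_w * out_h, 0.0f);

    for(std::size_t oy = 0; oy < out_h; ++oy)
    {
        for(std::size_t ox = 0; ox < out_w; ++ox)
        {
            float acc = 0.0f;
            for(std::size_t ky = 0; ky < ksize; ++ky)
            {
                for(std::size_t kx = 0; kx < ksize; ++kx)
                {
                    // Signed coordinates so that positions in the padding become negative.
                    const long iy = static_cast<long>(oy * stride + ky) - static_cast<long>(pad);
                    const long ix = static_cast<long>(ox * stride + kx) - static_cast<long>(pad);
                    if(iy < 0 || ix < 0 || iy >= static_cast<long>(height) || ix >= static_cast<long>(width))
                    {
                        continue; // zero padding contributes nothing
                    }
                    acc += input[static_cast<std::size_t>(iy) * width + static_cast<std::size_t>(ix)] * kernel[ky * ksize + kx];
                }
            }
            output[oy * out_w + ox] = acc;
        }
    }
    return output;
}
```

Convolving a 3x3 plane of ones with a 3x3 kernel of ones, stride 1 and padding 1, yields 4 at the corners, 6 on the edges and 9 in the centre, which is a quick sanity check for the padding logic.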

◆ validate()

Status validate ( const ITensorInfo * input,
const ITensorInfo * weights,
const ITensorInfo * output,
const PadStrideInfo & conv_info 
)
static

Static function to check if given info will lead to a valid configuration of NEDirectConvolutionLayerKernel.

Parameters
[in]  input  The input tensor to convolve. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: F16/F32.
[in]  weights  Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. The 3rd dimension must be the same as the input's 3rd dimension. Data type supported: Same as input.
[in]  output  Output tensor. The 3rd dimension must be equal to the 4th dimension of the weights tensor. Data types supported: F16/F32.
[in]  conv_info  Contains padding and stride information described in PadStrideInfo.
Returns
a status
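
The dimension constraints in the parameter descriptions can be sketched as a small check. This is illustrative only: `Shape4` and `check_shapes` are hypothetical, shapes are assumed to be ordered [width, height, channels, batch] for tensors and [kernel_x, kernel_y, IFM, OFM] for weights as described above, and an empty string stands in for a successful status.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical 4D shape; index 2 is the "3rd dimension" of the documentation.
using Shape4 = std::array<std::size_t, 4>;

// Returns an empty string when the documented constraints hold,
// otherwise a short description of the first violated constraint.
std::string check_shapes(const Shape4 &input, const Shape4 &weights, const Shape4 &output)
{
    if(weights[2] != input[2])
    {
        return "weights IFM (3rd dimension) must match the input's 3rd dimension";
    }
    if(output[2] != weights[3])
    {
        return "output 3rd dimension must match the weights' 4th dimension (OFM)";
    }
    return "";
}
```

For an 8x8x3 input and 3x3x3x16 weights, an output with 16 channels passes the check, while an output with 8 channels is reported as an error.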

Definition at line 1273 of file NEDirectConvolutionLayerKernel.cpp.

References ARM_COMPUTE_RETURN_ON_ERROR, NEDirectConvolutionLayerKernel::border_size(), ICloneable< T >::clone(), arm_compute::test::validation::conv_info, and arm_compute::validate_arguments().

Referenced by NEDirectConvolutionLayerKernel::name(), and NEDirectConvolutionLayer::validate().

{
    unsigned int num_weight_elems_read_per_row = 0;
    unsigned int num_elems_read_per_iteration = 0;
    unsigned int num_elems_written_per_iteration = 0;
    BorderSize border_size = {};
    // (line 1279 of NEDirectConvolutionLayerKernel.cpp is not shown in this listing)
    ARM_COMPUTE_RETURN_ON_ERROR(validate_and_configure_window(input->clone().get(),
                                                              weights->clone().get(),
                                                              output->clone().get(),
                                                              conv_info,
                                                              num_weight_elems_read_per_row,
                                                              num_elems_read_per_iteration,
                                                              num_elems_written_per_iteration,
                                                              border_size)
                                .first);

    return Status{};
}
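
The listing passes clones of the tensor infos to validate_and_configure_window(). A plausible reading, offered here as an assumption rather than a documented guarantee, is that the window configuration may modify the metadata it receives, so validation works on copies to leave the caller's objects untouched. The sketch below illustrates that pattern with hypothetical types and functions (`Info`, `configure_window`, `validate_on_copy`).

```cpp
#include <cstddef>

// Hypothetical tensor metadata; a zero-sized Info counts as "not yet initialised".
struct Info
{
    std::size_t width{ 0 };
    std::size_t height{ 0 };
    bool empty() const { return width == 0 && height == 0; }
};

// Hypothetical configuration step with a side effect: an empty output
// descriptor is filled in from the input before the shapes are compared.
bool configure_window(const Info &input, Info &output)
{
    if(output.empty())
    {
        output = input;
    }
    return output.width == input.width && output.height == input.height;
}

// Validation runs the same step on a copy, so the caller's output is never modified.
bool validate_on_copy(const Info &input, const Info &output)
{
    Info output_copy = output;
    return configure_window(input, output_copy);
}
```

After `validate_on_copy(in, out)` returns, an initially empty `out` is still empty, whereas calling `configure_window(in, out)` directly would have filled it.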

The documentation for this class was generated from the following files: