Compute Library
 21.08
CpuDirectConv2dKernel Class Reference

Interface for the kernel to perform Direct Convolution Layer. More...

#include <CpuDirectConv2dKernel.h>


Public Member Functions

 CpuDirectConv2dKernel ()=default
 
 ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE (CpuDirectConv2dKernel)
 
void configure (ITensorInfo *src, ITensorInfo *weights, ITensorInfo *dst, const PadStrideInfo &conv_info)
 Set the src, weights, and dst tensors. More...
 
void run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window. More...
 
const char * name () const override
 Name of the kernel. More...
 
BorderSize border_size () const override
 The size of the border for that kernel. More...
 
- Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor. More...
 
virtual void run (const Window &window, const ThreadInfo &info)
 Execute the kernel on the passed window. More...
 
virtual void run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator)
 Legacy compatibility layer for implementations which do not support thread_locator. In these cases we simply narrow the interface down to the legacy version. More...
 
- Public Member Functions inherited from IKernel
 IKernel ()
 Constructor. More...
 
virtual ~IKernel ()=default
 Destructor. More...
 
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable. More...
 
const Window & window () const
 The maximum window the kernel can be executed on. More...
 
bool is_window_configured () const
 Function to check if the embedded window of this kernel has been configured. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *src, const ITensorInfo *weights, const ITensorInfo *dst, const PadStrideInfo &conv_info)
 Static function to check if given info will lead to a valid configuration. More...
 

Detailed Description

Interface for the kernel to perform Direct Convolution Layer.

Definition at line 37 of file CpuDirectConv2dKernel.h.

Constructor & Destructor Documentation

◆ CpuDirectConv2dKernel()

CpuDirectConv2dKernel ( )
default

Member Function Documentation

◆ ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE()

ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE ( CpuDirectConv2dKernel  )

◆ border_size()

BorderSize border_size ( ) const
override virtual

The size of the border for that kernel.

Returns
The width in number of elements of the border.

Reimplemented from IKernel.

Definition at line 1222 of file CpuDirectConv2dKernel.cpp.

1223 {
1224  return _border_size;
1225 }

◆ configure()

void configure ( ITensorInfo * src,
ITensorInfo * weights,
ITensorInfo * dst,
const PadStrideInfo & conv_info
)

Set the src, weights, and dst tensors.

Note
DirectConvolution only works in the following configurations:
1x1 convolution with stride_x = 1/2/3, stride_y = 1/2/3
3x3 convolution with stride_x = 1/2/3, stride_y = 1/2/3
Parameters
[in] src The input tensor to convolve. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: F16/F32.
[in] weights Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. The 3rd dimension must be the same as the input's 3rd dimension. Data type supported: Same as input.
[out] dst Output tensor. The 3rd dimension must be equal to the 4th dimension of the weights tensor. Data types supported: F16/F32.
[in] conv_info Contains padding and stride information described in PadStrideInfo.

Definition at line 1227 of file CpuDirectConv2dKernel.cpp.

References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, arm_compute::auto_init_if_empty(), arm_compute::misc::shape_calculator::compute_deep_convolution_shape(), arm_compute::test::validation::configure(), arm_compute::test::validation::conv_info, ITensorInfo::data_layout(), ITensorInfo::data_type(), ITensorInfo::dimension(), arm_compute::get_data_layout_dimension_index(), arm_compute::NCHW, PadStrideInfo::pad_bottom(), PadStrideInfo::pad_left(), PadStrideInfo::pad_right(), PadStrideInfo::pad_top(), and arm_compute::WIDTH.

1228 {
1229  ARM_COMPUTE_ERROR_ON_NULLPTR(src, weights, dst);
1230 
1231  _conv_info = conv_info;
1232  _data_layout = src->data_layout();
1233  _kernel_size = weights->dimension(get_data_layout_dimension_index(_data_layout, DataLayoutDimension::WIDTH));
1234 
1235  const unsigned int conv_pad_left = conv_info.pad_left();
1236  const unsigned int conv_pad_top = conv_info.pad_top();
1237  const unsigned int conv_pad_right = conv_info.pad_right();
1238  const unsigned int conv_pad_bottom = conv_info.pad_bottom();
1239  if(_data_layout == DataLayout::NCHW)
1240  {
1241  _border_size = BorderSize(conv_pad_top, conv_pad_right, conv_pad_bottom, conv_pad_left);
1242  }
1243  else
1244  {
1245  _border_size = BorderSize(0);
1246  }
1247 
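1248  // Get convolved dimensions
1249  TensorShape output_shape = misc::shape_calculator::compute_deep_convolution_shape(*src, *weights, conv_info);
1250 
1251  DataType data_type = src->data_type();
1252 
1253  // Output auto initialization if not yet initialized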
1254  auto_init_if_empty(*dst, output_shape, 1, data_type);
1255 
1256  // Perform validation step
1257  ARM_COMPUTE_ERROR_THROW_ON(validate_arguments(src, weights, dst, conv_info));
1258 
1259  // Configure kernel window
1260  auto win_config = validate_and_configure_window(src, weights, dst, conv_info, _num_weight_elems_read_per_row,
1261  _num_elems_read_per_iteration, _num_elems_written_per_iteration, _border_size);
1262  ARM_COMPUTE_ERROR_THROW_ON(win_config.first);
1263  ICpuKernel::configure(win_config.second);
1264 }
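
The "Get convolved dimensions" step in the listing above derives the output shape from the input shape, the kernel size and the padding and stride values. As a rough illustration of that arithmetic, the following standalone sketch applies the usual floor-rounded convolution size formula. The names `ConvParams`, `OutputDims`, `conv_out_dim` and `conv_out_dims` are hypothetical and not part of Compute Library, and floor rounding is an assumption here; the library's own shape calculator should be treated as the reference.

```cpp
#include <cassert>

// Hypothetical stand-in for the stride/padding values carried by PadStrideInfo.
struct ConvParams
{
    unsigned int stride_x, stride_y;
    unsigned int pad_left, pad_right, pad_top, pad_bottom;
};

// Hypothetical holder for the spatial output size.
struct OutputDims
{
    unsigned int width, height;
};

// Floor-rounded output size along one axis (assumed rounding mode):
//   out = (in + pad_before + pad_after - kernel) / stride + 1
inline unsigned int conv_out_dim(unsigned int in, unsigned int kernel,
                                 unsigned int pad_before, unsigned int pad_after,
                                 unsigned int stride)
{
    assert(stride > 0);
    assert(in + pad_before + pad_after >= kernel);
    return (in + pad_before + pad_after - kernel) / stride + 1;
}

// Applies the per-axis formula to width and height for a square kernel.
inline OutputDims conv_out_dims(unsigned int in_w, unsigned int in_h,
                                unsigned int kernel, const ConvParams &p)
{
    return { conv_out_dim(in_w, kernel, p.pad_left, p.pad_right, p.stride_x),
             conv_out_dim(in_h, kernel, p.pad_top, p.pad_bottom, p.stride_y) };
}
```

For example, under these assumptions a 224x224 input with a 3x3 kernel, padding of 1 on every side and stride 2 gives `conv_out_dims(224, 224, 3, {2, 2, 1, 1, 1, 1})` a width and height of 112.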

◆ name()

const char * name ( ) const
override virtual

Name of the kernel.

Returns
Kernel name

Implements ICPPKernel.

Definition at line 1379 of file CpuDirectConv2dKernel.cpp.

1380 {
1381  return "CpuDirectConvolutionLayerKernel";
1382 }

◆ run_op()

void run_op ( ITensorPack & tensors,
const Window & window,
const ThreadInfo & info
)
override virtual

Execute the kernel on the passed window.

Warning
If is_parallelisable() returns false then the passed window must be equal to window()
Note
The window has to be a region within the window returned by the window() method
The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
[in] tensors A vector containing the tensors to operate on.
[in] window Region on which to execute the kernel. (Must be a region of the window returned by window())
[in] info Info about executing thread and CPU.

Reimplemented from ICPPKernel.

Definition at line 1286 of file CpuDirectConv2dKernel.cpp.

References arm_compute::ACL_DST, arm_compute::ACL_SRC_0, arm_compute::ACL_SRC_1, ARM_COMPUTE_ERROR, ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, ITensorInfo::dimension(), arm_compute::test::validation::dst, arm_compute::F16, arm_compute::F32, ITensorPack::get_const_tensor(), arm_compute::get_data_layout_dimension_index(), ITensorPack::get_tensor(), ITensor::info(), arm_compute::NCHW, arm_compute::test::validation::src, and arm_compute::WIDTH.

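1287 {
1288  ARM_COMPUTE_UNUSED(info);
1289  ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);
1290  ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW(IKernel::window(), window);
1291 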
1292  auto src = tensors.get_const_tensor(TensorType::ACL_SRC_0);
1293  auto weights = tensors.get_const_tensor(TensorType::ACL_SRC_1);
1294  auto dst = tensors.get_tensor(TensorType::ACL_DST);
1295  const int kernel_size = weights->info()->dimension(get_data_layout_dimension_index(_data_layout, DataLayoutDimension::WIDTH));
1296 
1297  if(_data_layout == DataLayout::NCHW)
1298  {
1299  switch(kernel_size)
1300  {
1301  case 1:
1302  {
1303  switch(src->info()->data_type())
1304  {
1305  case DataType::F32:
1306  convolve_1x1<float, float>(window, _num_elems_read_per_iteration, _num_elems_written_per_iteration, src, weights, dst, _conv_info);
1307  break;
1308 #ifdef __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
1309  case DataType::F16:
1310  convolve_1x1<float16_t, float16_t>(window, _num_elems_read_per_iteration, _num_elems_written_per_iteration, src, weights, dst, _conv_info);
1311  break;
1312 #endif /* __ARM_FEATURE_FP16_VECTOR_ARITHMETIC */
1313  default:
1314  ARM_COMPUTE_ERROR("Data type not supported");
1315  break;
1316  }
1317  break;
1318  }
1319  case 3:
1320  {
1321  switch(src->info()->data_type())
1322  {
1323  case DataType::F32:
1324  convolve_3x3<float, float>(window, _num_elems_read_per_iteration, _num_elems_written_per_iteration, src, weights, dst, _conv_info);
1325  break;
1326 #ifdef __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
1327  case DataType::F16:
1328  convolve_3x3<float16_t, float16_t>(window, _num_elems_read_per_iteration, _num_elems_written_per_iteration, src, weights, dst, _conv_info);
1329  break;
1330 #endif /* __ARM_FEATURE_FP16_VECTOR_ARITHMETIC */
1331  default:
1332  ARM_COMPUTE_ERROR("Data type not supported");
1333  break;
1334  }
1335  break;
1336  }
1337  case 5:
1338  {
1339  switch(src->info()->data_type())
1340  {
1341  case DataType::F32:
1342  convolve_5x5<float, float>(window, _num_elems_read_per_iteration, _num_elems_written_per_iteration, src, weights, dst, _conv_info);
1343  break;
1344  default:
1345  ARM_COMPUTE_ERROR("Data type not supported");
1346  break;
1347  }
1348  break;
1349  }
1350  default:
1351  {
1352  ARM_COMPUTE_ERROR("Only kernel sizes 1x1, 3x3 and 5x5 are supported.");
1353  break;
1354  }
1355  }
1356  }
1357  else
1358  {
1359  switch(src->info()->data_type())
1360  {
1361  case DataType::F32:
1362  {
1363  if(have_zero_x_internal_padding(src->info(), weights->info()))
1364  {
1365  convolve_nhwc_optimized<float>(window, src, weights, dst);
1366  }
1367  else
1368  {
1369  convolve_nhwc<float>(window, src, weights, dst);
1370  }
1371  break;
1372  }
1373  default:
1374  ARM_COMPUTE_ERROR("Data type not supported");
1375  break;
1376  }
1377  }
1378 }
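
The listing above only dispatches to specialised routines (convolve_1x1, convolve_3x3, convolve_5x5, convolve_nhwc) whose bodies are not shown on this page. To make clear what a direct convolution computes, here is a naive scalar reference for a single NCHW image. It is a sketch under stated assumptions, not the library's implementation: the function name `direct_conv_nchw`, the flat `std::vector<float>` storage, the symmetric zero padding and the single square stride are all hypothetical simplifications.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive direct convolution over one NCHW image (batch of 1), float only.
// Memory layouts (outermost to innermost), assumed for this sketch:
//   src:     [ifm][in_h][in_w]
//   weights: [ofm][ifm][k][k]
//   result:  [ofm][out_h][out_w]
std::vector<float> direct_conv_nchw(const std::vector<float> &src,
                                    const std::vector<float> &weights,
                                    int in_w, int in_h, int ifm, int ofm,
                                    int k, int stride, int pad)
{
    assert(stride > 0 && k > 0);
    assert(src.size() == static_cast<std::size_t>(ifm) * in_h * in_w);
    assert(weights.size() == static_cast<std::size_t>(ofm) * ifm * k * k);

    const int out_w = (in_w + 2 * pad - k) / stride + 1;
    const int out_h = (in_h + 2 * pad - k) / stride + 1;
    std::vector<float> dst(static_cast<std::size_t>(ofm) * out_h * out_w, 0.f);

    for(int o = 0; o < ofm; ++o)
    {
        for(int oy = 0; oy < out_h; ++oy)
        {
            for(int ox = 0; ox < out_w; ++ox)
            {
                float acc = 0.f;
                for(int c = 0; c < ifm; ++c)
                {
                    for(int ky = 0; ky < k; ++ky)
                    {
                        for(int kx = 0; kx < k; ++kx)
                        {
                            const int iy = oy * stride + ky - pad;
                            const int ix = ox * stride + kx - pad;
                            // Positions outside the image contribute zero (zero padding).
                            if(iy < 0 || iy >= in_h || ix < 0 || ix >= in_w)
                            {
                                continue;
                            }
                            acc += src[(static_cast<std::size_t>(c) * in_h + iy) * in_w + ix]
                                   * weights[((static_cast<std::size_t>(o) * ifm + c) * k + ky) * k + kx];
                        }
                    }
                }
                dst[(static_cast<std::size_t>(o) * out_h + oy) * out_w + ox] = acc;
            }
        }
    }
    return dst;
}
```

A reference like this is mainly useful for checking the output of an optimised kernel on small inputs; it makes no attempt at vectorisation or at the windowed, per-thread execution that run_op performs.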

◆ validate()

Status validate ( const ITensorInfo * src,
const ITensorInfo * weights,
const ITensorInfo * dst,
const PadStrideInfo & conv_info
)
static

Static function to check if given info will lead to a valid configuration.

Similar to CpuDirectConv2dKernel::configure()

Returns
a status

Definition at line 1266 of file CpuDirectConv2dKernel.cpp.

References ARM_COMPUTE_RETURN_ON_ERROR, ICloneable< T >::clone(), and arm_compute::test::validation::conv_info.

Referenced by CpuDirectConv2d::validate().

1267 {
1268  unsigned int num_weight_elems_read_per_row = 0;
1269  unsigned int num_elems_read_per_iteration = 0;
1270  unsigned int num_elems_written_per_iteration = 0;
1271  BorderSize border_size = {};
1272  ARM_COMPUTE_RETURN_ON_ERROR(validate_arguments(src, weights, dst, conv_info));
1273  ARM_COMPUTE_RETURN_ON_ERROR(validate_and_configure_window(src->clone().get(),
1274  weights->clone().get(),
1275  dst->clone().get(),
1276  conv_info,
1277  num_weight_elems_read_per_row,
1278  num_elems_read_per_iteration,
1279  num_elems_written_per_iteration,
1280  border_size)
1281  .first);
1282 
1283  return Status{};
1284 }
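
As the listing shows, validate() runs the same argument checks as configure() but on clones of the tensor infos, so a caller can test a configuration without modifying anything. The general validate-then-configure pattern can be modelled in isolation as below. Everything in this sketch is hypothetical: `SimpleStatus`, `ConvDesc` and `ToyKernel` are illustrative stand-ins rather than Compute Library types, and the accepted kernel sizes and strides are taken loosely from the note and error message on this page purely for demonstration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical, simplified status object; not the library's Status class.
struct SimpleStatus
{
    bool        ok;
    std::string message;
};

// Hypothetical description of a convolution request.
struct ConvDesc
{
    unsigned int kernel_size;
    unsigned int stride_x;
    unsigned int stride_y;
};

class ToyKernel
{
public:
    // Side-effect-free check: reports problems without touching any kernel state.
    static SimpleStatus validate(const ConvDesc &d)
    {
        if(d.kernel_size != 1 && d.kernel_size != 3 && d.kernel_size != 5)
        {
            return { false, "unsupported kernel size" };
        }
        if(d.stride_x < 1 || d.stride_x > 3 || d.stride_y < 1 || d.stride_y > 3)
        {
            return { false, "unsupported stride" };
        }
        return { true, "" };
    }

    // configure() reuses validate() and fails loudly on an invalid request.
    void configure(const ConvDesc &d)
    {
        const SimpleStatus status = validate(d);
        if(!status.ok)
        {
            throw std::invalid_argument(status.message);
        }
        _desc       = d;
        _configured = true;
    }

    bool is_configured() const
    {
        return _configured;
    }

private:
    ConvDesc _desc{};
    bool     _configured{ false };
};
```

Calling the static check first lets a caller fall back to another code path instead of handling an exception from configure().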

The documentation for this class was generated from the following files: