Compute Library 21.02
NEGEMMConvolutionLayer Class Reference

Basic function to compute the convolution layer. More...

#include <NEGEMMConvolutionLayer.h>

Collaboration diagram for NEGEMMConvolutionLayer:

Public Member Functions

 NEGEMMConvolutionLayer (const std::shared_ptr< IMemoryManager > &memory_manager=nullptr, IWeightsManager *weights_manager=nullptr)
 Constructor. More...
 
 NEGEMMConvolutionLayer (const NEGEMMConvolutionLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 NEGEMMConvolutionLayer (NEGEMMConvolutionLayer &&)=delete
 Prevent instances of this class from being moved (As this class contains non movable objects) More...
 
NEGEMMConvolutionLayer & operator= (const NEGEMMConvolutionLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
NEGEMMConvolutionLayer & operator= (NEGEMMConvolutionLayer &&)=delete
 Prevent instances of this class from being moved (As this class contains non movable objects) More...
 
 ~NEGEMMConvolutionLayer ()
 Default destructor. More...
 
void configure (const ITensor *input, const ITensor *weights, const ITensor *biases, ITensor *output, const PadStrideInfo &conv_info, const WeightsInfo &weights_info=WeightsInfo(), const Size2D &dilation=Size2D(1U, 1U), const ActivationLayerInfo &act_info=ActivationLayerInfo(), unsigned int num_groups=1)
 Set the input and output tensors. More...
 
void run () override
 Run the kernels contained in the function. More...
 
void prepare () override
 Prepare the function for executing. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, const PadStrideInfo &conv_info, const WeightsInfo &weights_info=WeightsInfo(), const Size2D &dilation=Size2D(1U, 1U), const ActivationLayerInfo &act_info=ActivationLayerInfo(), unsigned int num_groups=1)
 Static function to check if given info will lead to a valid configuration of NEGEMMConvolutionLayer. More...
 

Detailed Description

Basic function to compute the convolution layer.

This function calls the following Neon kernels/functions:

  1. NEIm2ColKernel
  2. NEGEMM (if the data type is BFLOAT16/FP16/FP32)
  3. NEGEMMLowpMatrixMultiplyCore (if the data type is QASYMM8/QASYMM8_SIGNED)
  4. NEGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPoint (if the data type is QASYMM8/QASYMM8_SIGNED)
  5. NEArithmeticAddition (if biases != nullptr and we have a 1x1 convolution with the NHWC data layout)
  6. NECol2ImKernel (if NCHW data layout)

Definition at line 163 of file NEGEMMConvolutionLayer.h.

Constructor & Destructor Documentation

◆ NEGEMMConvolutionLayer() [1/3]

NEGEMMConvolutionLayer ( const std::shared_ptr< IMemoryManager > &  memory_manager = nullptr,
IWeightsManager *  weights_manager = nullptr 
)

Constructor.

Definition at line 110 of file NEGEMMConvolutionLayer.cpp.

111  : _memory_group(memory_manager), _weights_manager(weights_manager), _reshape_weights(), _reshape_weights_managed(), _im2col_kernel(), _mm_gemm(memory_manager), _mm_gemmlowp(memory_manager),
112  _col2im_kernel(), _reshape_layer(), _original_weights(nullptr), _original_output(nullptr), _im2col_output(), _weights_reshaped(), _gemm_output(), _gemm_output_3d(), _tmp_output(),
113  _data_layout(DataLayout::NCHW), _skip_im2col(false), _skip_col2im(false), _is_quantized(false), _is_prepared(false)
114 {
115 }

◆ NEGEMMConvolutionLayer() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ NEGEMMConvolutionLayer() [3/3]

Prevent instances of this class from being moved (As this class contains non movable objects)

◆ ~NEGEMMConvolutionLayer()

~NEGEMMConvolutionLayer ( )
default

Default destructor.

Referenced by NEConvolutionLayerReshapeWeights::run().

Member Function Documentation

◆ configure()

void configure ( const ITensor *  input,
const ITensor *  weights,
const ITensor *  biases,
ITensor *  output,
const PadStrideInfo &  conv_info,
const WeightsInfo &  weights_info = WeightsInfo(),
const Size2D &  dilation = Size2D(1U, 1U),
const ActivationLayerInfo &  act_info = ActivationLayerInfo(),
unsigned int  num_groups = 1 
)

Set the input and output tensors.

Parameters
  [in]  input         Source tensor. 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/BFLOAT16/F16/F32.
  [in]  weights       Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: QASYMM8/QASYMM8_SIGNED/QSYMM8_PER_CHANNEL/BFLOAT16/F16/F32.
  [in]  biases        Biases tensor. Shared biases supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: Should match the input data type, except for an input of QASYMM8/QASYMM8_SIGNED type, where biases should be of S32 type.
  [out] output        Destination tensor. 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: Same as input.
  [in]  conv_info     Contains padding and stride information described in PadStrideInfo.
  [in]  weights_info  Specifies if the weights tensor has been reshaped with NEWeightsReshapeKernel. If this is not part of the fully connected layer, the weights tensor has also been transposed with NEGEMMTranspose1xWKernel. Data type supported: Same as input.
  [in]  dilation      (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
  [in]  act_info      (Optional) Activation layer information in case of a fused activation. Only RELU, BOUNDED_RELU and LU_BOUNDED_RELU are supported.
  [in]  num_groups    (Optional) Number of groups when performing a grouped convolution. num_groups != 1 is not supported.

Definition at line 258 of file NEGEMMConvolutionLayer.cpp.

References IWeightsManager::acquire(), TensorAllocator::allocate(), Tensor::allocator(), IWeightsManager::are_weights_managed(), ARM_COMPUTE_ERROR_ON_MSG, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, ARM_COMPUTE_UNUSED, arm_compute::BATCHES, arm_compute::BFLOAT16, NEReshapeLayer::configure(), NEConvolutionLayerReshapeWeights::configure(), NEConvolutionLayerReshapeWeightsTransform::configure(), arm_compute::test::validation::conv_info, arm_compute::test::validation::data_layout, ITensorInfo::data_layout(), arm_compute::test::validation::data_type, ITensorInfo::data_type(), ITensorInfo::dimension(), arm_compute::F32, arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, arm_compute::test::validation::idx_height, arm_compute::test::validation::idx_width, ITensor::info(), Tensor::info(), TensorAllocator::init(), arm_compute::test::validation::input, arm_compute::is_data_type_quantized_asymmetric(), MemoryGroup::manage(), arm_compute::NCHW, arm_compute::NHWC, arm_compute::test::validation::num_groups, ITensorInfo::quantization_info(), arm_compute::scaled_dimensions(), TensorShape::set(), ITensorInfo::set_data_layout(), arm_compute::test::validation::set_data_layout(), ITensorInfo::set_data_type(), TensorInfo::set_quantization_info(), PadStrideInfo::stride(), ITensorInfo::tensor_shape(), NEGEMMConvolutionLayer::validate(), arm_compute::test::validation::weights_info, and arm_compute::WIDTH.

260 {
261  ARM_COMPUTE_ERROR_ON_NULLPTR(input, weights, output);
262  ARM_COMPUTE_UNUSED(num_groups);
263  ARM_COMPUTE_ERROR_THROW_ON(NEGEMMConvolutionLayer::validate(input->info(),
264  weights->info(),
265  biases != nullptr ? biases->info() : nullptr,
266  output->info(),
267  conv_info,
268  weights_info,
269  dilation,
270  act_info,
271  num_groups));
272 
273  const DataType data_type = input->info()->data_type();
274  const DataLayout data_layout = input->info()->data_layout();
275  const int idx_width = get_data_layout_dimension_index(data_layout, DataLayoutDimension::WIDTH);
276  const int idx_height = get_data_layout_dimension_index(data_layout, DataLayoutDimension::HEIGHT);
277  const int idx_kernels = get_data_layout_dimension_index(data_layout, DataLayoutDimension::BATCHES);
278 
279  const unsigned int kernel_width = weights->info()->dimension(idx_width);
280  const unsigned int kernel_height = weights->info()->dimension(idx_height);
281 
282  _is_prepared = weights_info.retain_internal_weights();
283  _original_weights = weights;
284  _original_output = output;
285  _is_quantized = is_data_type_quantized_asymmetric(input->info()->data_type());
286  _data_layout = data_layout;
287  _skip_im2col = (data_layout == DataLayout::NHWC && kernel_width == 1 && kernel_height == 1 && conv_info.stride().first == 1 && conv_info.stride().second == 1);
288 
289  const ITensor *gemm_input_to_use = input;
290  ITensor *gemm_output_to_use = output;
291 
292  // Get convolved dimensions
293  unsigned int conv_w = 0;
294  unsigned int conv_h = 0;
295  std::tie(conv_w, conv_h) = scaled_dimensions(input->info()->dimension(idx_width),
296  input->info()->dimension(idx_height),
297  kernel_width,
298  kernel_height,
299  conv_info,
300  dilation);
301 
302  // Check if GEMM3D is supported
303  if(data_layout == DataLayout::NHWC)
304  {
305  _skip_col2im = bool(validate_gemm3d(input->info(), weights->info(), act_info, conv_h, true));
306  // If not supported, we need to perform im2col and col2im (or reshape layer)
307  if(!_skip_col2im)
308  {
309  _skip_im2col = false;
310  }
311  }
312  else
313  {
314  _skip_col2im = false;
315  }
316 
317  // Get parameters from conv_info
318  unsigned int stride_x = 0;
319  unsigned int stride_y = 0;
320  std::tie(stride_x, stride_y) = conv_info.stride();
321 
322  unsigned int mat_weights_cols = weights->info()->dimension(idx_kernels);
323 
324  // _weights_reshaped will be auto configured in the kernel.
325  // Just append biases and do not transpose 1xW as it will be reshaped in NEGEMM
326  const ITensor *weights_to_use = weights;
327 
328  if(_weights_manager && _weights_manager->are_weights_managed(weights))
329  {
330  _reshape_weights_managed.configure(weights, nullptr);
331  weights_to_use = _weights_manager->acquire(weights, &_reshape_weights_managed);
332  }
333  else
334  {
335  _reshape_weights.configure(weights, nullptr, &_weights_reshaped);
336  weights_to_use = &_weights_reshaped;
337  }
338 
339  // Create tensor to store im2col reshaped inputs
340  if(!_skip_im2col)
341  {
342  _memory_group.manage(&_im2col_output);
343 
344  // Configure
345  _im2col_kernel = std::make_unique<NEIm2ColKernel>();
346  _im2col_kernel->configure(input, &_im2col_output, Size2D(kernel_width, kernel_height), conv_info, false, dilation);
347 
348  // Update GEMM input
349  gemm_input_to_use = &_im2col_output;
350  }
351 
352  // Create temporary GEMM output tensor in case we cannot skip col2im
353  const DataType output_data_type = data_type == DataType::BFLOAT16 ? DataType::F32 : data_type;
354  if(!_skip_col2im)
355  {
356  TensorShape shape_gemm;
357 
358  // Calculate GEMM output shape
359  shape_gemm = _im2col_output.info()->tensor_shape();
360  shape_gemm.set(0, mat_weights_cols);
361  shape_gemm.set(1, conv_w * conv_h);
362 
363  // FIXME: input->clone() doesn't work with subtensors for grouped convolutions.
364  TensorInfo info_gemm(shape_gemm, 1, output_data_type);
365  info_gemm.set_quantization_info(output->info()->quantization_info()).set_data_layout(input->info()->data_layout());
366  _gemm_output.allocator()->init(info_gemm);
367  _gemm_output_3d.allocator()->init(info_gemm);
368  _memory_group.manage(&_gemm_output);
369 
370  // Update GEMM output
371  gemm_output_to_use = &_gemm_output;
372  }
373  else
374  {
375  TensorInfo out_info{ *output->info() };
376  out_info.set_data_type(output_data_type).set_data_layout(input->info()->data_layout());
377  _gemm_output.allocator()->init(out_info);
378  _gemm_output_3d.allocator()->init(out_info);
379  _memory_group.manage(&_gemm_output);
380 
381  // Update GEMM output
382  gemm_output_to_use = &_gemm_output_3d;
383  }
384 
385  // Configure GEMM
386  // In case we need to skip col2im, GEMM3D (gemm_3d_depth != 0) must be called in order to avoid reshaping the output matrix
387  const unsigned int gemm_3d_depth = _skip_col2im ? conv_h : 0;
388  configure_mm(gemm_input_to_use, weights_to_use, biases, gemm_output_to_use, act_info, gemm_3d_depth);
389 
390  if(!_skip_im2col)
391  {
392  _im2col_output.allocator()->allocate();
393  }
394 
395  if(!_skip_col2im)
396  {
397  if(_data_layout == DataLayout::NCHW)
398  {
399  // Configure col2im
400  _col2im_kernel = std::make_unique<NECol2ImKernel>();
401  _col2im_kernel->configure(gemm_output_to_use, output, Size2D(conv_w, conv_h));
402  }
403  else
404  {
405  // Configure reshape layer
406  _reshape_layer.configure(gemm_output_to_use, output);
407  }
408  }
409  else
410  {
411  // Configure reshape layer
412  _reshape_layer.configure(gemm_output_to_use, output);
413  }
414 
415  if(_is_quantized && !_skip_col2im)
416  {
417  _tmp_output.allocator()->allocate();
418  }
419 
420  _gemm_output.allocator()->allocate();
421 
422  ARM_COMPUTE_ERROR_ON_MSG((output->info()->dimension(idx_width) != conv_w) || (output->info()->dimension(idx_height) != conv_h),
423  "Output shape does not match the expected one");
424 }

◆ operator=() [1/2]

NEGEMMConvolutionLayer& operator= ( const NEGEMMConvolutionLayer &  )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

NEGEMMConvolutionLayer& operator= ( NEGEMMConvolutionLayer &&  )
delete

Prevent instances of this class from being moved (As this class contains non movable objects)

◆ prepare()

void prepare ( )
override virtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note
Prepare stage might not need all the function's buffers' backing memory to be available in order to execute

Reimplemented from IFunction.

Definition at line 618 of file NEGEMMConvolutionLayer.cpp.

References TensorAllocator::allocate(), Tensor::allocator(), IWeightsManager::are_weights_managed(), TensorAllocator::free(), ITensor::is_used(), ITensor::mark_as_unused(), NEGEMM::prepare(), NEGEMMLowpMatrixMultiplyCore::prepare(), IWeightsManager::run(), and NEConvolutionLayerReshapeWeights::run().

Referenced by NEGEMMConvolutionLayer::run().

619 {
620  if(!_is_prepared)
621  {
622  if(_weights_manager && _weights_manager->are_weights_managed(_original_weights))
623  {
624  _weights_manager->run(_original_weights, &_reshape_weights_managed);
625  }
626  else
627  {
628  // Run weights reshaping and mark original weights tensor as unused
629  _weights_reshaped.allocator()->allocate();
630  _reshape_weights.run();
631  _original_weights->mark_as_unused();
632  }
633 
634  // Prepare GEMM
635  _is_quantized ? _mm_gemmlowp.prepare() : _mm_gemm.prepare();
636  if(!_weights_reshaped.is_used())
637  {
638  _weights_reshaped.allocator()->free();
639  }
640 
641  _is_prepared = true;
642  }
643 }

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For Neon kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Will call prepare() on the first run if it hasn't been done already.

Implements IFunction.

Definition at line 566 of file NEGEMMConvolutionLayer.cpp.

References Tensor::allocator(), BorderSize::bottom, ITensor::buffer(), Window::DimY, ITensorInfo::extend_padding(), TensorAllocator::free(), Scheduler::get(), arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, TensorAllocator::import_memory(), ITensor::info(), Tensor::info(), arm_compute::NCHW, ITensorInfo::padding(), NEGEMMConvolutionLayer::prepare(), NEReshapeLayer::run(), NEGEMM::run(), NEGEMMLowpMatrixMultiplyCore::run(), IScheduler::schedule(), and BorderSize::top.

567 {
568  prepare();
569 
570  MemoryGroupResourceScope scope_mg(_memory_group);
571 
572  bool out_has_padding = _skip_col2im && (_original_output->info()->padding().bottom != 0 || _original_output->info()->padding().top != 0);
573 
574  if(!_skip_im2col)
575  {
576  // Run input reshaping
577  unsigned int y_dim = get_data_layout_dimension_index(_data_layout, DataLayoutDimension::HEIGHT);
578  NEScheduler::get().schedule(_im2col_kernel.get(), y_dim);
579  }
580 
581  // Handle the case where output has top/bottom padding
582  const ITensor *out_to_use = out_has_padding ? &_gemm_output : _original_output;
583  _gemm_output_3d.info()->extend_padding(out_to_use->info()->padding());
584  _gemm_output_3d.allocator()->import_memory(out_to_use->buffer());
585 
586  // Runs NEGEMM or NEGEMMLowpMatrixMultiplyCore functions
587  if(_is_quantized)
588  {
589  // Run gemmlowp
590  _mm_gemmlowp.run();
591  }
592  else
593  {
594  // Run gemm
595  _mm_gemm.run();
596  }
597 
598  // Reshape output matrix
599  if(!_skip_col2im)
600  {
601  if(_data_layout == DataLayout::NCHW)
602  {
603  NEScheduler::get().schedule(_col2im_kernel.get(), Window::DimY);
604  }
605  else
606  {
607  _reshape_layer.run();
608  }
609  }
610  else if(out_has_padding)
611  {
612  _reshape_layer.run();
613  }
614 
615  _gemm_output_3d.allocator()->free();
616 }

◆ validate()

Status validate ( const ITensorInfo *  input,
const ITensorInfo *  weights,
const ITensorInfo *  biases,
const ITensorInfo *  output,
const PadStrideInfo &  conv_info,
const WeightsInfo &  weights_info = WeightsInfo(),
const Size2D &  dilation = Size2D(1U, 1U),
const ActivationLayerInfo &  act_info = ActivationLayerInfo(),
unsigned int  num_groups = 1 
)
static

Static function to check if given info will lead to a valid configuration of NEGEMMConvolutionLayer.

Parameters
  [in]  input         Source tensor info. 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/BFLOAT16/F16/F32.
  [in]  weights       Weights tensor info. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: QASYMM8/QASYMM8_SIGNED/QSYMM8_PER_CHANNEL/BFLOAT16/F16/F32.
  [in]  biases        Biases tensor info. Shared biases supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: Should match the input data type, except for an input of QASYMM8/QASYMM8_SIGNED type, where biases should be of S32 type.
  [in]  output        Destination tensor info. 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: Same as input.
  [in]  conv_info     Contains padding and stride information described in PadStrideInfo.
  [in]  weights_info  Specifies if the weights tensor has been reshaped with NEWeightsReshapeKernel. If this is not part of the fully connected layer, the weights tensor has also been transposed with NEGEMMTranspose1xWKernel. Data type supported: Same as input.
  [in]  dilation      (Optional) Dilation, in elements, across x and y. Defaults to (1, 1).
  [in]  act_info      (Optional) Activation layer information in case of a fused activation. Only RELU, BOUNDED_RELU and LU_BOUNDED_RELU are supported.
  [in]  num_groups    (Optional) Number of groups when performing a grouped convolution. num_groups != 1 is not supported.
Returns
a status

Definition at line 426 of file NEGEMMConvolutionLayer.cpp.

References WeightsInfo::are_reshaped(), ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_LAYOUT, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_MSG, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, arm_compute::BATCHES, arm_compute::BFLOAT16, arm_compute::CHANNEL, arm_compute::misc::shape_calculator::compute_weights_reshaped_shape(), arm_compute::test::validation::conv_info, arm_compute::test::validation::data_layout, ITensorInfo::data_layout(), arm_compute::test::validation::data_type, ITensorInfo::data_type(), ITensorInfo::dimension(), arm_compute::F16, arm_compute::F32, arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, arm_compute::test::validation::idx_height, arm_compute::test::validation::idx_width, arm_compute::test::validation::input, arm_compute::is_data_type_quantized_asymmetric(), arm_compute::NCHW, arm_compute::NHWC, ITensorInfo::num_dimensions(), arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, arm_compute::QSYMM8_PER_CHANNEL, ITensorInfo::quantization_info(), arm_compute::S32, arm_compute::scaled_dimensions(), TensorShape::set(), arm_compute::test::validation::set_data_layout(), TensorInfo::set_quantization_info(), PadStrideInfo::stride(), ITensorInfo::tensor_shape(), NEConvolutionLayerReshapeWeights::validate(), NECol2ImKernel::validate(), NEIm2ColKernel::validate(), and arm_compute::WIDTH.

Referenced by NEGEMMConvolutionLayer::configure(), and NEConvolutionLayer::validate().

428 {
429  ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(input, weights, output);
430  ARM_COMPUTE_RETURN_ERROR_ON_MSG(weights_info.are_reshaped(), "Weights already reshaped are not supported!");
431  ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(input, 1, DataType::QASYMM8, DataType::QASYMM8_SIGNED, DataType::BFLOAT16, DataType::F16, DataType::F32);
432  ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(weights, 1, DataType::QASYMM8, DataType::QASYMM8_SIGNED, DataType::QSYMM8_PER_CHANNEL, DataType::BFLOAT16, DataType::F16, DataType::F32);
433  ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_LAYOUT(input, weights);
434  ARM_COMPUTE_RETURN_ERROR_ON_MSG(num_groups > 1, "Grouping (num_groups != 1) is not supported on Neon");
435 
436  const DataLayout data_layout = input->data_layout();
437  const DataType data_type = input->data_type();
438  const int idx_width = get_data_layout_dimension_index(data_layout, DataLayoutDimension::WIDTH);
439  const int idx_height = get_data_layout_dimension_index(data_layout, DataLayoutDimension::HEIGHT);
440  const int idx_channel = get_data_layout_dimension_index(data_layout, DataLayoutDimension::CHANNEL);
441  const int idx_kernels = get_data_layout_dimension_index(data_layout, DataLayoutDimension::BATCHES);
442 
443  const unsigned int kernel_width = weights->dimension(idx_width);
444  const unsigned int kernel_height = weights->dimension(idx_height);
445 
446  TensorInfo im2col_reshaped_info{};
447  TensorInfo info_gemm{};
448  TensorInfo tmp_info{};
449  TensorInfo weights_reshaped_info{};
450  const ITensorInfo *gemm_input_to_use = input;
451  const ITensorInfo *gemm_output_to_use = output;
452  const ITensorInfo *weights_to_use = weights;
453 
454  const bool append_bias = false;
455  const bool is_quantized = is_data_type_quantized_asymmetric(data_type);
456  const bool is_bf16 = data_type == DataType::BFLOAT16;
457  bool skip_im2col = (data_layout == DataLayout::NHWC && kernel_width == 1 && kernel_height == 1 && conv_info.stride().first == 1 && conv_info.stride().second == 1);
458 
459  // Get convolved dimensions
460  unsigned int conv_w = 0;
461  unsigned int conv_h = 0;
462 
463  std::tie(conv_w, conv_h) = scaled_dimensions(input->dimension(idx_width),
464  input->dimension(idx_height),
465  kernel_width,
466  kernel_height,
467  conv_info,
468  dilation);
469 
470  // Check if GEMM3D is supported
471  bool skip_col2im = false;
472  if(data_layout == DataLayout::NHWC)
473  {
474  skip_col2im = bool(validate_gemm3d(input, weights, act_info, conv_h, true));
475  // If not supported, we need to perform im2col and col2im (or reshape layer)
476  if(!skip_col2im)
477  {
478  skip_im2col = false;
479  }
480  }
481 
482  if(skip_col2im)
483  {
484  // If not supported, we need to perform im2col and col2im (or reshape layer)
485  if(!bool(validate_gemm3d(input, weights, act_info, conv_h, skip_im2col)))
486  {
487  skip_im2col = false;
488  skip_col2im = false;
489  }
490  }
491 
492  ARM_COMPUTE_RETURN_ERROR_ON(weights->dimension(idx_channel) != input->dimension(idx_channel));
493  ARM_COMPUTE_RETURN_ERROR_ON(weights->num_dimensions() > 4);
494 
495  // Validate biases
496  if(biases != nullptr)
497  {
498  if(is_quantized)
499  {
500  ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(biases, 1, DataType::S32);
501  }
502  else if(is_bf16)
503  {
504  ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(biases, 1, DataType::F32);
505  }
506  else
507  {
508  ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(input, biases);
509  }
510  ARM_COMPUTE_RETURN_ERROR_ON(biases->dimension(0) != weights->dimension(idx_kernels));
511  ARM_COMPUTE_RETURN_ERROR_ON(biases->num_dimensions() > 1);
512  }
513 
514  unsigned int mat_weights_cols = weights->dimension(idx_kernels);
515  unsigned int mat_weights_rows = weights->dimension(idx_width) * weights->dimension(idx_height) * weights->dimension(idx_channel);
516 
517  // Output tensor auto initialization if not yet initialized
518  ARM_COMPUTE_RETURN_ON_ERROR(NEConvolutionLayerReshapeWeights::validate(weights, nullptr, nullptr));
519  weights_reshaped_info = TensorInfo(compute_weights_reshaped_shape(*weights, append_bias), 1, data_type);
520  weights_reshaped_info.set_quantization_info(weights->quantization_info());
521  weights_to_use = &weights_reshaped_info;
522 
523  if(!skip_im2col)
524  {
525  // Create tensor info for im2col reshaped inputs
526  // For Neon the batch size is on the fourth dimension
527  // TODO (giaiod01): Auto-initialize the output shape of im2col COMPMID-1482
528  TensorShape shape_im2col = input->tensor_shape();
529  shape_im2col.set(0, mat_weights_rows);
530  shape_im2col.set(1, conv_w * conv_h);
531  shape_im2col.set(2, 1);
532 
533  im2col_reshaped_info = TensorInfo(shape_im2col, 1, data_type);
534  im2col_reshaped_info.set_quantization_info(input->quantization_info());
535 
536  ARM_COMPUTE_RETURN_ON_ERROR(NEIm2ColKernel::validate(input, &im2col_reshaped_info, Size2D(kernel_width, kernel_height), conv_info, append_bias, dilation));
537  gemm_input_to_use = &im2col_reshaped_info;
538  }
539 
540  // Create temporary GEMM output tensor in case we cannot skip col2im
541  const DataType output_data_type = data_type == DataType::BFLOAT16 ? DataType::F32 : data_type;
542  if(!skip_col2im)
543  {
544  TensorShape shape_gemm = gemm_input_to_use->tensor_shape();
545  shape_gemm.set(0, mat_weights_cols);
546  shape_gemm.set(1, conv_w * conv_h);
547  info_gemm = TensorInfo(shape_gemm, 1, output_data_type);
548  }
549  else
550  {
551  info_gemm = TensorInfo(output->tensor_shape(), 1, output_data_type);
552  }
553  info_gemm.set_quantization_info(output->quantization_info()).set_data_layout(input->data_layout());
554  gemm_output_to_use = &info_gemm;
555  ARM_COMPUTE_RETURN_ON_ERROR(validate_mm(gemm_input_to_use, weights_to_use, biases, gemm_output_to_use, act_info, skip_col2im ? conv_h : 0, skip_im2col));
556 
557  // Validate Col2Im/ReshapeLayer
558  if(!skip_col2im && (data_layout == DataLayout::NCHW))
559  {
560  ARM_COMPUTE_RETURN_ON_ERROR(NECol2ImKernel::validate(gemm_output_to_use, output, Size2D(conv_w, conv_h)));
561  }
562 
563  return Status{};
564 }

The documentation for this class was generated from the following files:

  • NEGEMMConvolutionLayer.h
  • NEGEMMConvolutionLayer.cpp