Compute Library
 19.08
NEFuseBatchNormalizationKernel Class Reference

NEON kernel to fuse the batch normalization node to a preceding convolution node. More...

#include <NEFuseBatchNormalizationKernel.h>

Collaboration diagram for NEFuseBatchNormalizationKernel:

Public Member Functions

const char * name () const override
 Name of the kernel. More...
 
 NEFuseBatchNormalizationKernel ()
 Default constructor. More...
 
 NEFuseBatchNormalizationKernel (const NEFuseBatchNormalizationKernel &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
NEFuseBatchNormalizationKernel & operator= (const NEFuseBatchNormalizationKernel &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 NEFuseBatchNormalizationKernel (NEFuseBatchNormalizationKernel &&)=default
 Allow instances of this class to be moved. More...
 
NEFuseBatchNormalizationKernel & operator= (NEFuseBatchNormalizationKernel &&)=default
 Allow instances of this class to be moved. More...
 
 ~NEFuseBatchNormalizationKernel ()=default
 Default destructor. More...
 
void configure (const ITensor *input_weights, const ITensor *bn_mean, const ITensor *bn_var, ITensor *fused_weights, ITensor *fused_bias, const ITensor *input_bias=nullptr, const ITensor *bn_beta=nullptr, const ITensor *bn_gamma=nullptr, float epsilon=0.001f, FuseBatchNormalizationType fbn_type=FuseBatchNormalizationType::CONVOLUTION)
 Set the source and destination of the kernel. More...
 
void run (const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window. More...
 
- Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor. More...
 
- Public Member Functions inherited from IKernel
 IKernel ()
 Constructor. More...
 
virtual ~IKernel ()=default
 Destructor. More...
 
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable. More...
 
virtual BorderSize border_size () const
 The size of the border for that kernel. More...
 
const Window & window () const
 The maximum window the kernel can be executed on. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *input_weights, const ITensorInfo *bn_mean, const ITensorInfo *bn_var, const ITensorInfo *fused_weights, const ITensorInfo *fused_bias, const ITensorInfo *input_bias=nullptr, const ITensorInfo *bn_beta=nullptr, const ITensorInfo *bn_gamma=nullptr, float epsilon=0.001f, FuseBatchNormalizationType fbn_type=FuseBatchNormalizationType::CONVOLUTION)
 Static function to check if given info will lead to a valid configuration of NEFuseBatchNormalizationKernel. More...
 

Detailed Description

NEON kernel to fuse the batch normalization node to a preceding convolution node.

Definition at line 35 of file NEFuseBatchNormalizationKernel.h.
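
Conceptually, fusing batch normalization into a preceding convolution folds the normalization scale and shift into the convolution weights and bias, so the separate batch normalization step can be dropped at inference time. The scalar sketch below is only meant to illustrate that arithmetic; it is not the kernel's vectorised NEON implementation, and the function and parameter names (fuse_bn_reference, n_channels, etc.) are hypothetical.

#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative per-output-channel fusion:
//   w_fused[c] = w[c] * gamma[c] / sqrt(var[c] + epsilon)
//   b_fused[c] = (b[c] - mean[c]) * gamma[c] / sqrt(var[c] + epsilon) + beta[c]
void fuse_bn_reference(std::vector<float>       &weights, // [n_channels * elems_per_channel], updated in place
                       std::vector<float>       &bias,    // [n_channels], updated in place
                       const std::vector<float> &bn_mean,
                       const std::vector<float> &bn_var,
                       const std::vector<float> &bn_beta,
                       const std::vector<float> &bn_gamma,
                       float                     epsilon,
                       std::size_t               n_channels)
{
    const std::size_t elems_per_channel = weights.size() / n_channels;
    for(std::size_t c = 0; c < n_channels; ++c)
    {
        const float scale = bn_gamma[c] / std::sqrt(bn_var[c] + epsilon);
        for(std::size_t i = 0; i < elems_per_channel; ++i)
        {
            weights[c * elems_per_channel + i] *= scale;
        }
        bias[c] = (bias[c] - bn_mean[c]) * scale + bn_beta[c];
    }
}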

Constructor & Destructor Documentation

◆ NEFuseBatchNormalizationKernel() [1/3]

Default constructor.

Definition at line 416 of file NEFuseBatchNormalizationKernel.cpp.

NEFuseBatchNormalizationKernel::NEFuseBatchNormalizationKernel()
    : _input_weights(nullptr), _input_bias(nullptr), _bn_mean(nullptr), _bn_var(nullptr), _bn_gamma(nullptr), _bn_beta(nullptr), _fused_weights(nullptr), _fused_bias(nullptr), _epsilon(),
      _run_in_place_weights(false), _run_in_place_bias(false), _func(nullptr)
{
}

◆ NEFuseBatchNormalizationKernel() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ NEFuseBatchNormalizationKernel() [3/3]

Allow instances of this class to be moved.

◆ ~NEFuseBatchNormalizationKernel()

Default destructor.

Member Function Documentation

◆ configure()

void configure ( const ITensor *  input_weights,
                 const ITensor *  bn_mean,
                 const ITensor *  bn_var,
                 ITensor *        fused_weights,
                 ITensor *        fused_bias,
                 const ITensor *  input_bias = nullptr,
                 const ITensor *  bn_beta = nullptr,
                 const ITensor *  bn_gamma = nullptr,
                 float            epsilon = 0.001f,
                 FuseBatchNormalizationType  fbn_type = FuseBatchNormalizationType::CONVOLUTION
               )

Set the source and destination of the kernel.

Parameters
    [in]  input_weights  Input weights tensor for convolution or depthwise convolution layer. Data type supported: F16/F32. Data layout supported: NCHW, NHWC
    [in]  bn_mean        Batch normalization layer mean tensor. Same as input_weights
    [in]  bn_var         Batch normalization layer variance tensor. Same as input_weights
    [out] fused_weights  (Optional) Output fused weights tensor. It can be a nullptr in case of in-place computation. Same as input_weights
    [out] fused_bias     (Optional) Output fused bias tensor. It can be a nullptr in case of in-place computation and input_bias != nullptr. Same as input_weights
    [in]  input_bias     (Optional) Input bias tensor for convolution or depthwise convolution layer. It can be a nullptr in case the bias tensor is not required. Same as input_weights
    [in]  bn_beta        (Optional) Batch normalization layer beta tensor. It can be a nullptr in case the beta tensor is not required. Same as input_weights. Note: if nullptr, bn_beta is set to 0.0
    [in]  bn_gamma       (Optional) Batch normalization layer gamma tensor. It can be a nullptr in case the gamma tensor is not required. Same as input_weights. Note: if nullptr, bn_gamma is set to 1.0
    [in]  epsilon        (Optional) Batch normalization layer epsilon parameter. Defaults to 0.001f.
    [in]  fbn_type       (Optional) Fused batch normalization type. Defaults to CONVOLUTION.

Definition at line 422 of file NEFuseBatchNormalizationKernel.cpp.

void NEFuseBatchNormalizationKernel::configure(const ITensor *input_weights, const ITensor *bn_mean, const ITensor *bn_var,
                                               ITensor *fused_weights, ITensor *fused_bias,
                                               const ITensor *input_bias, const ITensor *bn_beta, const ITensor *bn_gamma,
                                               float epsilon, FuseBatchNormalizationType fbn_type)
{
    ARM_COMPUTE_ERROR_ON_NULLPTR(input_weights, bn_mean, bn_var);

    _input_weights = input_weights;
    _input_bias    = input_bias;
    _bn_mean       = bn_mean;
    _bn_var        = bn_var;
    _bn_beta       = bn_beta;
    _bn_gamma      = bn_gamma;
    _fused_weights = fused_weights;
    _fused_bias    = fused_bias;
    _epsilon       = epsilon;

    _run_in_place_weights = (fused_weights == nullptr) || (fused_weights == input_weights);
    _run_in_place_bias    = (fused_bias == nullptr) || (input_bias != nullptr && fused_bias == input_bias);

    // Auto initialize outputs
    if(_fused_weights != nullptr)
    {
        // Output tensor auto initialization if not yet initialized
        auto_init_if_empty(*_fused_weights->info(), *_input_weights->info()->clone());
        fused_weights->info()->set_valid_region(input_weights->info()->valid_region());
    }
    if(_fused_bias != nullptr)
    {
        // Output tensor auto initialization if not yet initialized
        auto_init_if_empty(*_fused_bias->info(), *_bn_mean->info()->clone());
        _fused_bias->info()->set_valid_region(bn_mean->info()->valid_region());
    }

    // Validate arguments
    ARM_COMPUTE_ERROR_THROW_ON(validate_arguments(input_weights->info(), bn_mean->info(), bn_var->info(),
                                                  (fused_weights != nullptr) ? fused_weights->info() : nullptr,
                                                  (fused_bias != nullptr) ? fused_bias->info() : nullptr,
                                                  (input_bias != nullptr) ? input_bias->info() : nullptr,
                                                  (bn_beta != nullptr) ? bn_beta->info() : nullptr,
                                                  (bn_gamma != nullptr) ? bn_gamma->info() : nullptr,
                                                  epsilon, fbn_type));

    // Configure kernel window
    Window win = calculate_max_window(*input_weights->info());
    INEKernel::configure(win);

    // Configure function
    static std::map<std::string, FuseBatchNormFunction *> map_function =
    {
        { "fused_batch_normalization_conv_NHWC_F32", &fused_batch_normalization_conv<wrapper::traits::neon_vector<float, 4>> },
        { "fused_batch_normalization_conv_NCHW_F32", &fused_batch_normalization_conv<wrapper::traits::neon_vector<float, 4>> },
        { "fused_batch_normalization_dwc_NHWC_F32", &fused_batch_normalization_dwc_nhwc<wrapper::traits::neon_vector<float, 4>> },
        { "fused_batch_normalization_dwc_NCHW_F32", &fused_batch_normalization_dwc_nchw<wrapper::traits::neon_vector<float, 4>> },
#ifdef __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
        { "fused_batch_normalization_conv_NHWC_F16", &fused_batch_normalization_conv<wrapper::traits::neon_vector<float16_t, 8>> },
        { "fused_batch_normalization_conv_NCHW_F16", &fused_batch_normalization_conv<wrapper::traits::neon_vector<float16_t, 8>> },
        { "fused_batch_normalization_dwc_NHWC_F16", &fused_batch_normalization_dwc_nhwc<wrapper::traits::neon_vector<float16_t, 8>> },
        { "fused_batch_normalization_dwc_NCHW_F16", &fused_batch_normalization_dwc_nchw<wrapper::traits::neon_vector<float16_t, 8>> },
#endif /* __ARM_FEATURE_FP16_VECTOR_ARITHMETIC */
    };

    std::string function_to_call("fused_batch_normalization_");
    function_to_call += fbn_type == FuseBatchNormalizationType::CONVOLUTION ? "conv_" : "dwc_";
    function_to_call += string_from_data_layout(_input_weights->info()->data_layout());
    function_to_call += "_";
    function_to_call += string_from_data_type(_input_weights->info()->data_type());

    auto it = map_function.find(function_to_call);

    if(it != map_function.end())
    {
        _func = it->second;
    }
}

References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, arm_compute::auto_init_if_empty(), arm_compute::calculate_max_window(), ICloneable< T >::clone(), arm_compute::CONVOLUTION, ITensorInfo::data_layout(), ITensorInfo::data_type(), epsilon, ITensor::info(), ITensorInfo::set_valid_region(), arm_compute::string_from_data_layout(), arm_compute::string_from_data_type(), and ITensorInfo::valid_region().

Referenced by NEFuseBatchNormalization::configure().
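
In practice this kernel is normally driven through the NEFuseBatchNormalization function, but it can also be configured and scheduled directly. The sketch below shows one possible flow under stated assumptions: the shapes, layout, and split dimension are illustrative only, and the scheduler call follows the usual NEON pattern rather than this kernel's documented usage.

#include "arm_compute/core/NEON/kernels/NEFuseBatchNormalizationKernel.h"
#include "arm_compute/runtime/NEON/NEScheduler.h"
#include "arm_compute/runtime/Tensor.h"

using namespace arm_compute;

void fuse_conv_batchnorm_example()
{
    // Hypothetical 3x3 convolution weights: 16 input channels, 32 output channels, NHWC / F32.
    Tensor weights, bn_mean, bn_var, bn_beta, bn_gamma, fused_weights, fused_bias;
    TensorInfo weights_info(TensorShape(16U, 3U, 3U, 32U), 1, DataType::F32);
    weights_info.set_data_layout(DataLayout::NHWC);
    weights.allocator()->init(weights_info);
    for(Tensor *t : { &bn_mean, &bn_var, &bn_beta, &bn_gamma })
    {
        t->allocator()->init(TensorInfo(TensorShape(32U), 1, DataType::F32));
    }

    // Configure: fused_weights/fused_bias infos are auto-initialized from the inputs (no input_bias here).
    NEFuseBatchNormalizationKernel fuse_bn;
    fuse_bn.configure(&weights, &bn_mean, &bn_var, &fused_weights, &fused_bias,
                      nullptr /* input_bias */, &bn_beta, &bn_gamma,
                      0.001f, FuseBatchNormalizationType::CONVOLUTION);

    // Allocate backing memory, fill the input tensors, then execute (split dimension chosen for illustration).
    for(Tensor *t : { &weights, &bn_mean, &bn_var, &bn_beta, &bn_gamma, &fused_weights, &fused_bias })
    {
        t->allocator()->allocate();
    }
    NEScheduler::get().schedule(&fuse_bn, Window::DimY);
}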

◆ name()

const char * name ( ) const
inline override virtual

Name of the kernel.

Returns
Kernel name

Implements ICPPKernel.

Definition at line 38 of file NEFuseBatchNormalizationKernel.h.

{
    return "NEFuseBatchNormalizationKernel";
}

◆ operator=() [1/2]

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

Allow instances of this class to be moved.

◆ run()

void run ( const Window &  window,
           const ThreadInfo &  info
         )
override virtual

Execute the kernel on the passed window.

Warning
If is_parallelisable() returns false then the passed window must be equal to window()
Note
The window has to be a region within the window returned by the window() method
The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
    [in] window  Region on which to execute the kernel. (Must be a region of the window returned by window())
    [in] info    Info about executing thread and CPU.

Implements ICPPKernel.

Definition at line 507 of file NEFuseBatchNormalizationKernel.cpp.

void NEFuseBatchNormalizationKernel::run(const Window &window, const ThreadInfo &info)
{
    ARM_COMPUTE_UNUSED(info);
    ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);
    ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW(INEKernel::window(), window);

    (*_func)(_input_weights, _input_bias, _fused_weights, _fused_bias, _bn_mean, _bn_var, _bn_beta, _bn_gamma, _epsilon, window);
}

References ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, arm_compute::test::validation::info, and IKernel::window().
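
For single-threaded execution without the scheduler, a configured kernel can also be run directly on its maximum window. Continuing the configuration sketch shown under configure(), and assuming the 19.08 ThreadInfo layout and IScheduler::cpu_info() accessor, one possible call looks like this:

// Run the configured kernel over its full window on the calling thread (sketch only).
ThreadInfo info{};
info.cpu_info = &NEScheduler::get().cpu_info(); // assumption: CPUInfo taken from the default scheduler
fuse_bn.run(fuse_bn.window(), info);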

◆ validate()

Status validate ( const ITensorInfo *  input_weights,
                  const ITensorInfo *  bn_mean,
                  const ITensorInfo *  bn_var,
                  const ITensorInfo *  fused_weights,
                  const ITensorInfo *  fused_bias,
                  const ITensorInfo *  input_bias = nullptr,
                  const ITensorInfo *  bn_beta = nullptr,
                  const ITensorInfo *  bn_gamma = nullptr,
                  float                epsilon = 0.001f,
                  FuseBatchNormalizationType  fbn_type = FuseBatchNormalizationType::CONVOLUTION
                )
static

Static function to check if given info will lead to a valid configuration of NEFuseBatchNormalizationKernel.

Parameters
    [in] input_weights  Input weights tensor info for convolution or depthwise convolution layer. Data type supported: F16/F32. Data layout supported: NCHW, NHWC
    [in] bn_mean        Batch normalization layer mean tensor info. Same as input_weights
    [in] bn_var         Batch normalization layer variance tensor info. Same as input_weights
    [in] fused_weights  (Optional) Output fused weights tensor info. It can be a nullptr in case of in-place computation. Same as input_weights
    [in] fused_bias     (Optional) Output fused bias tensor info. It can be a nullptr in case of in-place computation and input_bias != nullptr. Same as input_weights
    [in] input_bias     (Optional) Input bias tensor info for convolution or depthwise convolution layer. It can be a nullptr in case the bias tensor is not required. Same as input_weights
    [in] bn_beta        (Optional) Batch normalization layer beta tensor info. It can be a nullptr in case the beta tensor is not required. Same as input_weights. Note: if nullptr, bn_beta is set to 0.0
    [in] bn_gamma       (Optional) Batch normalization layer gamma tensor info. It can be a nullptr in case the gamma tensor is not required. Same as input_weights. Note: if nullptr, bn_gamma is set to 1.0
    [in] epsilon        (Optional) Batch normalization layer epsilon parameter. Defaults to 0.001f.
    [in] fbn_type       (Optional) Fused batch normalization type. Defaults to CONVOLUTION.
Returns
a status

Definition at line 498 of file NEFuseBatchNormalizationKernel.cpp.

Status NEFuseBatchNormalizationKernel::validate(const ITensorInfo *input_weights, const ITensorInfo *bn_mean, const ITensorInfo *bn_var,
                                                const ITensorInfo *fused_weights, const ITensorInfo *fused_bias,
                                                const ITensorInfo *input_bias, const ITensorInfo *bn_beta, const ITensorInfo *bn_gamma,
                                                float epsilon, FuseBatchNormalizationType fbn_type)
{
    ARM_COMPUTE_RETURN_ON_ERROR(validate_arguments(input_weights, bn_mean, bn_var, fused_weights, fused_bias, input_bias, bn_beta, bn_gamma, epsilon, fbn_type));
    return Status{};
}

References ARM_COMPUTE_RETURN_ON_ERROR, and epsilon.

Referenced by NEFuseBatchNormalization::validate().
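
Because validate() only inspects tensor metadata, it can be used to check a planned fusion before allocating any memory. A minimal sketch follows; the shapes and the choice to pass explicit output infos are assumptions for illustration.

#include "arm_compute/core/NEON/kernels/NEFuseBatchNormalizationKernel.h"
#include "arm_compute/core/TensorInfo.h"

using namespace arm_compute;

void validate_fusion_example()
{
    // Hypothetical shapes: 3x3 convolution, 16 input channels, 32 output channels, NHWC / F32.
    TensorInfo weights_info(TensorShape(16U, 3U, 3U, 32U), 1, DataType::F32);
    weights_info.set_data_layout(DataLayout::NHWC);
    const TensorInfo stats_info(TensorShape(32U), 1, DataType::F32);
    const TensorInfo fused_weights_info = weights_info; // same shape/type as the input weights
    const TensorInfo fused_bias_info    = stats_info;   // same shape/type as the BN statistics

    const Status status = NEFuseBatchNormalizationKernel::validate(&weights_info, &stats_info, &stats_info,
                                                                   &fused_weights_info, &fused_bias_info,
                                                                   nullptr /* input_bias */,
                                                                   &stats_info /* bn_beta */, &stats_info /* bn_gamma */,
                                                                   0.001f, FuseBatchNormalizationType::CONVOLUTION);
    if(status.error_code() != ErrorCode::OK)
    {
        // Configuration is not supported; status.error_description() carries the reason.
    }
}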


The documentation for this class was generated from the following files:

NEFuseBatchNormalizationKernel.h
NEFuseBatchNormalizationKernel.cpp